linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Oliver O'Halloran	7990102446	powerpc/timer: Large Decrementer support Power ISAv3 adds a large decrementer (LD) mode which increases the size of the decrementer register. The size of the enlarged decrementer register is between 32 and 64 bits with the exact size being dependent on the implementation. When in LD mode, reads are sign extended to 64 bits and a decrementer exception is raised when the high bit is set (i.e the value goes below zero). Writes however are truncated to the physical register width so some care needs to be taken to ensure that the high bit is not set when reloading the decrementer. This patch adds support for using the LD inside the host kernel on processors that support it. When LD mode is supported firmware will supply the ibm,dec-bits property for CPU nodes to allow the kernel to determine the maximum decrementer value. Enabling LD mode is a hypervisor privileged operation so the kernel can only enable it manually when running in hypervisor mode. Guests that support LD mode can request it using the "ibm,client-architecture-support" firmware call (not implemented in this patch) or some other platform specific method. If this property is not supplied then the traditional decrementer width of 32 bit is assumed and LD mode will not be enabled. This patch was based on initial work by Jack Miller. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-05 23:58:53 +10:00
Andrew Donnellan	a9862c7440	powerpc/rtas: Fix array overrun in ppc_rtas() syscall If ppc_rtas() is called with args.nargs == 16 and args.nret == 0, args.rets is set to point to &args.args[16], which is beyond the end of the args.args array. This results in a minor read overrun of the array when we check the first return code (which, per PAPR, is a required output of all RTAS calls) to see if there's been a hardware error. Change the nargs/nret check to ensure nargs is <= 15, allowing room for the status code. Users shouldn't be calling with nret == 0, but there's no real harm if they do, so we don't stop them. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-05 23:49:52 +10:00
Chris Smart	ae26b36f80	powerpc: Send SIGBUS on unaligned copy and paste Calling ISA 3.0 instructions copy, copy_first, paste and paste_last generates an alignment fault when copying or pasting unaligned data (128 byte). We catch this and send SIGBUS to the userspace process that caused it. We do not emulate these because paste may contain additional metadata when pasting to a co-processor and paste_last is the synchronisation point for preceding copy/paste sequences. Thanks to Michael Neuling <mikey@neuling.org> for his help. Signed-off-by: Chris Smart <chris@distroguy.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-05 23:49:51 +10:00
Madhavan Srinivasan	393eb79ad3	powerpc/perf: factor out power8 __init_pmu code Factor out the power8 pmu init functions to share with power9. Monitor Mode Control Register S(MMCRS) and Monitor Mode Control Register H(MMCRH) registers are dropped in Power9. These registers are added to new function which are included for power8 init. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-05 23:49:47 +10:00
Michael Ellerman	b5b1cfc5d4	powerpc/fadump: Fix build error introduced by recent cleanup We spent so much time bike-shedding the printk() we missed that the next line was missing a semi-colon. And it seems none of our defconfigs turn on CONFIG_FA_DUMP. Fixes: `4a03749f14` ("powerpc/fadump: Trivial fix of spelling mistake, clean up message") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-05 23:49:46 +10:00
Darren Stevens	bfa37087aa	powerpc: Initialise pci_io_base as early as possible Commit `d6a9996e84` ("powerpc/mm: vmalloc abstraction in preparation for radix") turned kernel memory and IO addresses from #defined constants to variables initialised at runtime. On PA6T (pasemi) systems the setup_arch() machine call initialises the onboard PCI-e root-ports, and uses pci_io_base to do this, which is now before its value has been set, resulting in a panic early in boot before console IO is initialised. Move the pci_io_base initialisation to the same place as vmalloc ranges are set (hash__early_init_mmu()/radix__early_init_mmu()) - this is the earliest possible place we can initialise it. Fixes: `d6a9996e84` ("powerpc/mm: vmalloc abstraction in preparation for radix") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Darren Stevens <darren@stevens-zone.net> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Add #ifdef CONFIG_PCI, massage change log slightly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-30 16:52:29 +10:00
Michael Neuling	190ce8693c	powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 Currently we have 2 segments that are bolted for the kernel linear mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel stacks. Anything accessed outside of these regions may need to be faulted in. (In practice machines with TM always have 1T segments) If a machine has < 2TB of memory we never fault on the kernel linear mapping as these two segments cover all physical memory. If a machine has > 2TB of memory, there may be structures outside of these two segments that need to be faulted in. This faulting can occur when running as a guest as the hypervisor may remove any SLB that's not bolted. When we treclaim and trecheckpoint we have a window where we need to run with the userspace GPRs. This means that we no longer have a valid stack pointer in r1. For this window we therefore clear MSR RI to indicate that any exceptions taken at this point won't be able to be handled. This means that we can't take segment misses in this RI=0 window. In this RI=0 region, we currently access the thread_struct for the process being context switched to or from. This thread_struct access may cause a segment fault since it's not guaranteed to be covered by the two bolted segment entries described above. We've seen this with a crash when running as a guest with > 2TB of memory on PowerVM: Unrecoverable exception 4100 at c00000000004f138 Oops: Unrecoverable exception, sig: 6 [#1] SMP NR_CPUS=2048 NUMA pSeries CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G X 4.4.13-46-default #1 task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000 NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20 REGS: c000189001d5f730 TRAP: 4100 Tainted: G X (4.4.13-46-default) MSR: 8000000100001031 <SF,ME,IR,DR,LE> CR: 24000048 XER: 00000000 CFAR: c00000000004ed18 SOFTE: 0 GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288 GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620 GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0 GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211 GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110 GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050 GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30 GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680 NIP [c00000000004f138] restore_gprs+0xd0/0x16c LR [0000000010003a24] 0x10003a24 Call Trace: [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable) [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0 [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350 [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0 [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0 [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0 [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130 [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4 Instruction dump: 7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6 38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098 ---[ end trace 602126d0a1dedd54 ]--- This fixes this by copying the required data from the thread_struct to the stack before we clear MSR RI. Then once we clear RI, we only access the stack, guaranteeing there's no segment miss. We also tighten the region over which we set RI=0 on the treclaim() path. This may have a slight performance impact since we're adding an mtmsr instruction. Fixes: `090b9284d7` ("powerpc/tm: Clear MSR RI in non-recoverable TM code") Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-29 16:19:25 +10:00
Gavin Shan	cca0e542e0	powerpc/eeh: Fix wrong argument passed to eeh_rmv_device() When calling eeh_rmv_device() in eeh_reset_device() for partial hotplug case, @rmv_data instead of its address is the proper argument. Otherwise, the stack frame is corrupted when writing to @rmv_data (actually its address) in eeh_rmv_device(). It results in kernel crash as observed. This fixes the issue by passing @rmv_data, not its address to eeh_rmv_device() in eeh_reset_device(). Fixes: `67086e32b5` ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE") Reported-by: Pridhiviraj Paidipeddi <ppaidipe@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-28 20:47:49 +10:00
Colin Ian King	4a03749f14	powerpc/fadump: Trivial fix of spelling mistake, clean up message Fix trivial spelling mistake "rgistration". Also use pr_err() instead of printk() and unsplit the string to keep it all on one line. Signed-off-by: Colin Ian King <colin.king@canonical.com> [mpe: Keep rc on the same line, splitting it doesn't help] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-28 13:50:47 +10:00
Cyril Bur	8e96a87c54	powerpc/tm: Always reclaim in start_thread() for exec() class syscalls Userspace can quite legitimately perform an exec() syscall with a suspended transaction. exec() does not return to the old process, rather it load a new one and starts that, the expectation therefore is that the new process starts not in a transaction. Currently exec() is not treated any differently to any other syscall which creates problems. Firstly it could allow a new process to start with a suspended transaction for a binary that no longer exists. This means that the checkpointed state won't be valid and if the suspended transaction were ever to be resumed and subsequently aborted (a possibility which is exceedingly likely as exec()ing will likely doom the transaction) the new process will jump to invalid state. Secondly the incorrect attempt to keep the transactional state while still zeroing state for the new process creates at least two TM Bad Things. The first triggers on the rfid to return to userspace as start_thread() has given the new process a 'clean' MSR but the suspend will still be set in the hardware MSR. The second TM Bad Thing triggers in __switch_to() as the processor is still transactionally suspended but __switch_to() wants to zero the TM sprs for the new process. This is an example of the outcome of calling exec() with a suspended transaction. Note the first 700 is likely the first TM bad thing decsribed earlier only the kernel can't report it as we've loaded userspace registers. c000000000009980 is the rfid in fast_exception_return() Bad kernel stack pointer 3fffcfa1a370 at c000000000009980 Oops: Bad kernel stack pointer, sig: 6 [#1] CPU: 0 PID: 2006 Comm: tm-execed Not tainted NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000 REGS: c00000003ffefd40 TRAP: 0700 Not tainted MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]> CR: 00000000 XER: 00000000 CFAR: c0000000000098b4 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000 NIP [c000000000009980] fast_exception_return+0xb0/0xb8 LR [0000000000000000] (null) Call Trace: Instruction dump: f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070 e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b Kernel BUG at c000000000043e80 [verbose debug info unavailable] Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033) Oops: Unrecoverable exception, sig: 6 [#2] CPU: 0 PID: 2006 Comm: tm-execed Tainted: G D task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000 NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000 REGS: c00000003ffef7e0 TRAP: 0700 Tainted: G D MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]> CR: 28002828 XER: 00000000 CFAR: c000000000015a20 SOFTE: 0 PACATMSCRATCH: b00000010000d033 GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000 GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000 GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004 GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000 GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000 GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80 NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c LR [c000000000015a24] __switch_to+0x1f4/0x420 Call Trace: Instruction dump: 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020 This fixes CVE-2016-5828. Fixes: `bc2a9408fa` ("powerpc: Hook in new transactional memory code") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-27 20:35:17 +10:00
Benjamin Herrenschmidt	cdb1b3424d	powerpc/pci: Reduce log level of PCI I/O space warning If a PHB has no I/O space, there's no need to make it look like something bad happened, a pr_debug() is plenty enough since this is the case of all our modern POWER chips. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-24 15:26:31 +10:00
Michael Ellerman	6e914ee629	powerpc: Fix faults caused by radix patching of SLB miss handler As part of the Radix MMU support we added some feature sections in the SLB miss handler. These are intended to catch the case that we incorrectly take an SLB miss when Radix is enabled, and instead of crashing weirdly they bail out to a well defined exit path and trigger an oops. However the way they were written meant the bailout case was enabled by default until we did CPU feature patching. On powermacs the early debug prints in setup_system() can cause an SLB miss, which happens before code patching, and so the SLB miss handler would incorrectly bailout and crash during boot. Fix it by inverting the sense of the feature section, so that the code which is in place at boot is correct for the hash case. Once we determine we are using Radix - which will never happen on a powermac - only then do we patch in the bailout case which unconditionally jumps. Fixes: `caca285e5a` ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code") Reported-by: Denis Kirjanov <kda@linux-powerpc.org> Tested-by: Denis Kirjanov <kda@linux-powerpc.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-23 09:58:17 +10:00
Gavin Shan	8cc7581cdb	powerpc/pci: Delay populating pdn The pdn (struct pci_dn) instances are allocated from memblock or bootmem when creating PCI controller (hoses) in setup_arch(). PCI hotplug, which will be supported by proceeding patches, releases PCI device nodes and their corresponding pdn on unplugging event. The memory chunks for pdn instances allocated from memblock or bootmem are hard to reused after being released. This delays creating pdn by pci_devs_phb_init() from setup_arch() to core_initcall() so that they are allocated from slab. The memory consumed by pdn can be released to system without problem during PCI unplugging time. It indicates that pci_dn is unavailable in setup_arch() and the the fixup on pdn (like AGP's) can't be carried out that time. We have to do that in pcibios_root_bridge_prepare() on maple/pasemi/powermac platforms where/when the pdn is available. pcibios_root_bridge_prepare is called from subsys_initcall() which is executed after core_initcall() so the code flow does not change. At the mean while, the EEH device is created when pdn is populated, meaning pdn and EEH device have same life cycle. In turn, we needn't call eeh_dev_init() to create EEH device explicitly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:56 +10:00
Gavin Shan	7415c14c56	powerpc/pci: Update bridge windows on PCI plug On the PCI plugging event, PCI slot's subordinate devices are scanned and their (IO and MMIO) resources are assigned. Platform dependent resources (PE#, IO/MMIO/DMA windows) are allocated or created on updating windows of the slot's upstream bridge. This updates the windows of the hot plugged slot's upstream bridge in pcibios_finish_adding_to_bus() so that the platform resources (PE#, IO/MMIO/DMA segments) are allocated or created accordingly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:56 +10:00
Gavin Shan	c5fcb29a64	powerpc/pci: Override pcibios_setup_bridge() This overrides pcibios_setup_bridge() that is called to update PCI bridge windows when PCI resource assignment is completed, to assign PE and setup various (resource) mapping for the PE in subsequent patches. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:52 +10:00
Mauricio Faria de Oliveira	f8ab481066	powerpc: export cpu_to_core_id() Export cpu_to_core_id(). This will be used by the lpfc driver. This enables topology_core_id() from <linux/topology.h> (defined to cpu_to_core_id() in arch/powerpc/include/asm/topology.h) to be used by (non-builtin) modules. That is arch-neutral, already used by eg, drivers/base/topology.c, but it is builtin (obj-y in Makefile) thus didn't need the export. Since the module uses topology_core_id() and this is defined to cpu_to_core_id(), it needs the export, otherwise: ERROR: "cpu_to_core_id" [drivers/scsi/lpfc/lpfc.ko] undefined! Tested on next-20160601. Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:51 +10:00
Jack Miller	bd3ea317fd	powerpc: Load Monitor Register Support This enables new registers, LMRR and LMSER, that can trigger an EBB in userspace code when a monitored load (via the new ldmx instruction) loads memory from a monitored space. This facility is controlled by a new FSCR bit, LM. This patch disables the FSCR LM control bit on task init and enables that bit when a load monitor facility unavailable exception is taken for using it. On context switch, this bit is then used to determine whether the two relevant registers are saved and restored. This is done lazily for performance reasons. Signed-off-by: Jack Miller <jack@codezen.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:50 +10:00
Michael Neuling	b57bd2de8c	powerpc: Improve FSCR init and context switching This fixes a few issues with FSCR init and switching. In commit `152d523e63` ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") we moved the setting of the FSCR register from inside an CPU_FTR_ARCH_207S section to inside just a CPU_FTR_ARCH_DSCR section. Hence we are setting FSCR on POWER6/7 where the FSCR doesn't exist. This is harmless but we shouldn't do it. Also, we can simplify the FSCR context switch. We don't need to go through the calculation involving dscr_inherit. We can just restore what we saved last time. We also set an initial value in INIT_THREAD, so that pid 1 which is cloned from that gets a sane value. Based on patch by Jack Miller. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:50 +10:00
Madhavan Srinivasan	103b7827d9	powerpc: Fix misleading comment in early_setup_secondary() Current comment in the early_setup_secondary() for paca->soft_enabled update is misleading. Comment should say to Mark interrupts "disabled" instead of "enabled". Fix the typo. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:49 +10:00
Thiago Jung Bauermann	61ed9cfb1b	powerpc/kprobes: Remove kretprobe_trampoline_holder. Fixes the following testsuite failure: $ sudo ./perf test -v kallsyms 1: vmlinux symtab matches kallsyms : --- start --- test child forked, pid 12489 Using /proc/kcore for kernel object code Looking at the vmlinux_path (8 entries long) Using /boot/vmlinux for symbols 0xc00000000003d300: diff name v: .kretprobe_trampoline_holder k: kretprobe_trampoline Maps only in vmlinux: c00000000086ca38-c000000000879b6c 87ca38 [kernel].text.unlikely c000000000879b6c-c000000000bf0000 889b6c [kernel].meminit.text c000000000bf0000-c000000000c53264 c00000 [kernel].init.text c000000000c53264-d000000004250000 c63264 [kernel].exit.text d000000004250000-d000000004450000 0 [libcrc32c] d000000004450000-d000000004620000 0 [xfs] d000000004620000-d000000004680000 0 [autofs4] d000000004680000-d0000000046e0000 0 [x_tables] d0000000046e0000-d000000004780000 0 [ip_tables] d000000004780000-d0000000047e0000 0 [rng_core] d0000000047e0000-ffffffffffffffff 0 [pseries_rng] Maps in vmlinux with a different name in kallsyms: Maps only in kallsyms: d000000000000000-f000000000000000 1000000000010000 [kernel.kallsyms] f000000000000000-ffffffffffffffff 3000000000010000 [kernel.kallsyms] test child finished with -1 ---- end ---- vmlinux symtab matches kallsyms: FAILED! The problem is that the kretprobe_trampoline symbol looks like this: $ eu-readelf -s /boot/vmlinux G kretprobe_trampoline 2431: c000000001302368 24 NOTYPE LOCAL DEFAULT 37 kretprobe_trampoline_holder 2432: c00000000003d300 8 FUNC LOCAL DEFAULT 1 .kretprobe_trampoline_holder 97543: c00000000003d300 0 NOTYPE GLOBAL DEFAULT 1 kretprobe_trampoline Its type is NOTYPE, and its size is 0, and this is a problem because symbol-elf.c:dso__load_sym skips function symbols that are not STT_FUNC or STT_GNU_IFUNC (this is determined by elf_sym__is_function). Even if the type is changed to STT_FUNC, when dso__load_sym calls symbols__fixup_duplicate, the kretprobe_trampoline symbol is dropped in favour of .kretprobe_trampoline_holder because the latter has non-zero size (as determined by choose_best_symbol). With this patch, all vmlinux symbols match /proc/kallsyms and the testcase passes. Commit `c1c355ce14` ("x86/kprobes: Get rid of kretprobe_trampoline_holder()") gets rid of kretprobe_trampoline_holder altogether on x86. This commit does the same on powerpc. This change introduces no regressions on the perf and ftracetest testsuite results. Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-21 15:30:49 +10:00
Mahesh Salgaonkar	fd7bacbca4	KVM: PPC: Book3S HV: Fix TB corruption in guest exit path on HMI interrupt When a guest is assigned to a core it converts the host Timebase (TB) into guest TB by adding guest timebase offset before entering into guest. During guest exit it restores the guest TB to host TB. This means under certain conditions (Guest migration) host TB and guest TB can differ. When we get an HMI for TB related issues the opal HMI handler would try fixing errors and restore the correct host TB value. With no guest running, we don't have any issues. But with guest running on the core we run into TB corruption issues. If we get an HMI while in the guest, the current HMI handler invokes opal hmi handler before forcing guest to exit. The guest exit path subtracts the guest TB offset from the current TB value which may have already been restored with host value by opal hmi handler. This leads to incorrect host and guest TB values. With split-core, things become more complex. With split-core, TB also gets split and each subcore gets its own TB register. When a hmi handler fixes a TB error and restores the TB value, it affects all the TB values of sibling subcores on the same core. On TB errors all the thread in the core gets HMI. With existing code, the individual threads call opal hmi handle independently which can easily throw TB out of sync if we have guest running on subcores. Hence we will need to co-ordinate with all the threads before making opal hmi handler call followed by TB resync. This patch introduces a sibling subcore state structure (shared by all threads in the core) in paca which holds information about whether sibling subcores are in Guest mode or host mode. An array in_guest[] of size MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore. The subcore id is used as index into in_guest[] array. Only primary thread entering/exiting the guest is responsible to set/unset its designated array element. On TB error, we get HMI interrupt on every thread on the core. Upon HMI, this patch will now force guest to vacate the core/subcore. Primary thread from each subcore will then turn off its respective bit from the above bitmap during the guest exit path just after the guest->host partition switch is complete. All other threads that have just exited the guest OR were already in host will wait until all other subcores clears their respective bit. Once all the subcores turn off their respective bit, all threads will will make call to opal hmi handler. It is not necessary that opal hmi handler would resync the TB value for every HMI interrupts. It would do so only for the HMI caused due to TB errors. For rest, it would not touch TB value. Hence to make things simpler, primary thread would call TB resync explicitly once for each core immediately after opal hmi handler instead of subtracting guest offset from TB. TB resync call will restore the TB with host value. Thus we can be sure about the TB state. One of the primary threads exiting the guest will take up the responsibility of calling TB resync. It will use one of the top bits (bit 63) from subcore state flags bitmap to make the decision. The first primary thread (among the subcores) that is able to set the bit will have to call the TB resync. Rest all other threads will wait until TB resync is complete. Once TB resync is complete all threads will then proceed. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-06-20 14:11:25 +10:00
Bjorn Helgaas	3830135857	powerpc/pci: Implement pci_resource_to_user() with pcibios_resource_to_bus() "User" addresses are shown in /sys/devices/pci.../.../resource and /proc/bus/pci/devices and used as mmap offsets for /proc/bus/pci/BB/DD.F files. For I/O port resources on powerpc, these are PCI bus addresses, i.e., raw BAR values. Previously pci_resource_to_user() computed the user address by subtracting "hose->io_base_virt - _IO_BASE" from the resource start: pci_resource_to_user() if (IO) offset = (unsigned long)hose->io_base_virt - _IO_BASE; start = rsrc->start - offset; We've already told the PCI core about that "hose->io_base_virt - _IO_BASE" offset: pcibios_setup_phb_resources() res = &hose->io_resource; offset = pcibios_io_space_offset(); / i.e., "offset = hose->io_base_virt - _IO_BASE" */ pci_add_resource_offset(resources, res, offset); so pcibios_resource_to_bus() knows how to do that translation. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Yinghai Lu <yinghai@kernel.org>	2016-06-17 14:44:44 -05:00
Yinghai Lu	1e70cdd6e6	powerpc/pci: Remove __pci_mmap_set_pgprot() The powerpc-specific __pci_mmap_set_pgprot() does two things: 1) Disables write combining for I/O port space mappings This only affects procfs mappings. The pci_mmap_resource() sysfs path only requests write combining for resources with IORESOURCE_PREFETCH set, which doesn't include I/O resources. The only way to request write combining for I/O port space mappings was via the PCIIOC_WRITE_COMBINE ioctl and the proc_bus_pci_mmap() path, and we recently changed that path to ignore write combining for I/O, so this code in powerpc is no longer needed. 2) Automatically enables write combining for mappings of prefetchable resources, even if not requested by the user Both procfs (via PCIIOC_MMAP_IS_MEM and PCIIOC_WRITE_COMBINE ioctls) and sysfs (via "resourceN_wc" files, which are created for resources with IORESOURCE_PREFETCH) provide ways for the user to map PCI memory space with write combining. Users that desire write combining should use one of those ways instead of relying on powerpc-specific behavior. Remove the powerpc-specific __pci_mmap_set_pgprot(). The user-visible effect of this change is that powerpc users mapping prefetchable PCI memory space via procfs without PCIIOC_WRITE_COMBINE or via sysfs "resourceN" (not "resourceN_wc") will get regular uncacheable mappings instead of the write combining mappings they used to get. The new behavior matches the behavior on all other arches that support write combining mapping. [bhelgaas: changelog] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2016-06-17 14:43:33 -05:00
Gavin Shan	a3aa256b72	powerpc/eeh: Fix invalid cached PE primary bus The PE primary bus cannot be got from its child devices when having full hotplug in error recovery. The PE primary bus is cached, which is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually. This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset()) can get valid PE primary bus through eeh_pe_bus_get(). Fixes: `67086e32b5` ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE") Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-17 19:51:47 +10:00
Russell Currey	fb36e90736	powerpc/pci: Fix SRIOV not building without EEH enabled On Book3E CPUs (and possibly other configs), it is possible to have SRIOV (CONFIG_PCI_IOV) set without CONFIG_EEH. The SRIOV code does not check for this, and if EEH is disabled, pci_dn.c fails to build. Fix this by gating the EEH-specific code in the SRIOV implementation behind CONFIG_EEH. Fixes: `39218cd0` ("powerpc/eeh: EEH device for VF") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-17 16:52:22 +10:00
Daniel Axtens	a9650e9bc5	powerpc/align: Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE Sparse complains that it doesn't know what REG_BYTE is: arch/powerpc/kernel/align.c:313:29: error: undefined identifier 'REG_BYTE' REG_BYTE is defined differently based on whether we're compiling for LE, BE32 or BE64. Sparse apparently doesn't provide __BIG_ENDIAN__ or __LITTLE_ENDIAN__, which means we get no definition. Rather than check for __BIG_ENDIAN__ and then separately for __LITTLE_ENDIAN__, just switch the #ifdef to check for __BIG_ENDIAN__ and then #else we define the little endian version. Technically that's dicey because PDP_ENDIAN is also a possibility, but we already do it in a lot of places so one more hardly matters. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 22:40:19 +10:00
Daniel Axtens	665e87ffe1	powerpc/sparse: Include headers containing prototypes Sometimes headers that provide prototypes for functions are accidentally omitted from the files that define the functions. Fix a couple of times that occurs. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 22:40:19 +10:00
Daniel Axtens	42f5b4cacd	powerpc: Introduce asm-prototypes.h Sparse picked up a number of functions that are implemented in C and then only referred to in asm code. This introduces asm-prototypes.h, which provides a place for prototypes of these functions. This silences some sparse warnings. Signed-off-by: Daniel Axtens <dja@axtens.net> [mpe: Add include guards, clean up copyright & GPL text] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 22:39:54 +10:00
Daniel Axtens	34852ed551	powerpc/sparse: make some things static This is just a smattering of things picked up by sparse that should be made static. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 22:23:11 +10:00
Suraj Jitindar Singh	1d1451655b	powerpc: Add array bounds checking to crash_shutdown_handlers The array crash_shutdown_handles is an array of size CRASH_HANDLER_MAX+1 containing up to CRASH_HANDLER_MAX shutdown_handlers. It is assumed to be NULL terminated, which it is under normal circumstances. Array accesses in the functions crash_shutdown_unregister() and default_machine_crash_shutdown() rely on this NULL termination property when traversing this list and don't protect again out of bounds accesses. If the NULL terminator were somehow overwritten these functions could potentially access out of the bounds of the array. Shrink the array to size CRASH_HANDLER_MAX and implement explicit array bounds checking when accessing the elements of the crash_shutdown_handles[] array in crash_shutdown_unregister() and default_machine_crash_shutdown(). Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 16:05:47 +10:00
Rashmica Gupta	aac6a91fea	powerpc/asm: Remove unused symbols in asm-offsets.c THREAD_DSCR: Added in `efcac6589a` "powerpc: Per process DSCR + some fixes (try#4)" Last usage removed in `152d523e63` "powerpc: Create context switch helpers save_sprs() and restore_sprs()" THREAD_DSCR_INHERIT: Added in `714332858b` "powerpc: Restore correct DSCR in context switch" Last usage removed in `152d523e63` "powerpc: Create context switch helpers save_sprs() and restore_sprs()" THREAD_TAR: Added in `2468dcf641` "powerpc: Add support for context switching the TAR register" Last usage removed in `152d523e63` "powerpc: Create context switch helpers save_sprs() and restore_sprs()" THREAD_BESCR, THREAD_EBBHR and THREAD_EBBRR: Added in `9353374b8e` "powerpc: Context switch the new EBB SPRs" Last usage removed in `152d523e63` "powerpc: Create context switch helpers save_sprs() and restore_sprs()" THREAD_SIAR, THREAD_SDAR, THREAD_SIER, THREAD_MMCR0, and THREAD_MMCR2: Added in `59affcd3e4` "powerpc: Context switch more PMU related SPRs" Last usage removed in `b11ae95100` "powerpc: Partial revert of "Context switch more PMU related SPRs"" PACA_LOCK_TOKEN: Added in `9e368f2915` "KVM: PPC: book3s_hv: Add support for PPC970-family processors" Last usage removed in `c17b98cf60` "KVM: PPC: Book3S HV: Remove code for PPC970 processors" HCALL_STAT_SIZE, HCALL_STAT_CALLS, HCALL_STAT_TB and HCALL_STAT_PURR: Added in `57852a853b` "[POWERPC] powerpc: Instrument Hypervisor Calls" Last usage removed in `c8cd093a6e` "powerpc: tracing: Add hypervisor call tracepoints" VCPU_EPLC: Added in `d30f6e4800` "KVM: PPC: booke: category E.HV (GS-mode) support" Never used. CPU_DOWN_FLUSH: Added in `e7affb1dba` "powerpc/cache: add cache flush operation for various e500" Never used. CFG_STAMP_XSEC: Added in `14cf11af6c` "powerpc: Merge enough to start building in arch/powerpc." Last usage removed in `0e469db8f7` "powerpc: Rework VDSO gettimeofday to prevent time going backwards" KVM_LPCR: Added in `aa04b4cc5b` "KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests" Last usage removed in `a0144e2a6b` "KVM: PPC: Book3S HV: Store LPCR value for each virtual core" GPR15, GPR16, GPR17, GPR18, GPR19, GPR20, GPR21, GPR22, GPR23, GPR24, GPR25, GPR26, GPR27, GPR28, GPR29, GPR30 and GPR31: Added in `14cf11af6c` "powerpc: Merge enough to start building in arch/powerpc." Never used. VCPU_SHADOW_FSCR: Added in `616dff8602` "KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR" Never used. VCPU_SHADOW_SRR1: Added in `a2d56020d1` "KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu" Never used. KVM_SPLIT_SIZE: Added in `b4deba5c41` "KVM: PPC: Book3S HV: Implement dynamicmicro-threading on POWER8" Never used. VCPU_VCPUID: Added in `de56a948b9` "KVM: PPC: Add support for Book3S processors in hypervisor mode" Last usage removed `1b400ba0cd` "KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations" _MQ: Added in `14cf11af6c` "powerpc: Merge enough to start building in arch/powerpc." Never used. AUDITCONTEXT: Added in `14cf11af6c` "powerpc: Merge enough to start building in arch/powerpc." Last usage removed in `401d1f029b` "[PATCH] syscall entry/exit revamp" CLONE_VM: Added in `14cf11af6c` "powerpc: Merge enough to start building in arch/powerpc." Currently unused. CLONE_UNTRACED: Added in `14cf11af6c` "powerpc: Merge enough to start building in arch/powerpc." Currently unused. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> [mpe: Munge change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 15:11:25 +10:00
Kees Cook	1addc57e11	powerpc/ptrace: run seccomp after ptrace Close the hole where ptrace can change a syscall out from under seccomp. Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org	2016-06-14 10:54:46 -07:00
Andy Lutomirski	2f275de5d1	seccomp: Add a seccomp_data parameter secure_computing() Currently, if arch code wants to supply seccomp_data directly to seccomp (which is generally much faster than having seccomp do it using the syscall_get_xyz() API), it has to use the two-phase seccomp hooks. Add it to the easy hooks, too. Cc: linux-arch@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>	2016-06-14 10:54:39 -07:00
Greg Kurz	6e45273eac	powerpc/pseries: Fix trivial typo in function name Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-14 16:05:35 +10:00
Michael Ellerman	f55d966536	powerpc: Define and use PPC64_ELF_ABI_v2/v1 We're approaching 20 locations where we need to check for ELF ABI v2. That's fine, except the logic is a bit awkward, because we have to check that _CALL_ELF is defined and then what its value is. So check it once in asm/types.h and define PPC64_ELF_ABI_v2 when ELF ABI v2 is detected. We also have a few places where what we're really trying to check is that we are using the 64-bit v1 ABI, ie. function descriptors. So also add a #define for that, which simplifies several checks. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-14 13:58:27 +10:00
Michael Ellerman	027dfac694	powerpc: Various typo fixes Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-14 13:58:26 +10:00
Christophe Leroy	e289086f65	powerpc/32: Get rid of sub_reloc_offset() sub_reloc_offset() has not been used since commit `917f0af9e5` ("powerpc: Remove arch/ppc and include/asm-ppc") which removed include/asm-ppc/prom.h. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-14 13:58:26 +10:00
Anton Blanchard	d96f234f47	powerpc: Avoid load hit store in setup_sigcontext() In setup_sigcontext(), we set current->thread.vrsave then use it straight after. Since current is hidden from the compiler via inline assembly, it cannot optimise this and we end up with a load hit store. Fix this by using a temporary. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-14 13:58:25 +10:00
Anton Blanchard	8eb9803723	powerpc: Avoid load hit store in __giveup_fpu() and __giveup_altivec() In both __giveup_fpu() and __giveup_altivec() we make two modifications to tsk->thread.regs->msr. gcc decides to do a read/modify/write of each change, so we end up with a load hit store: ld r9,264(r10) rldicl r9,r9,50,1 rotldi r9,r9,14 std r9,264(r10) ... ld r9,264(r10) rldicl r9,r9,40,1 rotldi r9,r9,24 std r9,264(r10) Fix this by using a temporary. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-14 13:58:25 +10:00
Michael Ellerman	2c2a63e301	powerpc/pseries: Fix IBM_ARCH_VEC_NRCORES_OFFSET since POWER8NVL was added The recent commit `7cc851039d` ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") added a new PVR mask & value to the start of the ibm_architecture_vec[] array. However it missed the fact that further down in the array, we hard code the offset of one of the fields, and then at boot use that value to patch the value in the array. This means every update to the array must also update the #define, ugh. This means that on pseries machines we will misreport to firmware the number of cores we support, by a factor of threads_per_core. Fix it for now by updating the #define. Fixes: `7cc851039d` ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-08 10:40:05 +10:00
Khem Raj	1e407ee3b2	powerpc/ptrace: Fix out of bounds array access warning gcc-6 correctly warns about a out of bounds access arch/powerpc/kernel/ptrace.c:407:24: warning: index 32 denotes an offset greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' [-Warray-bounds] offsetof(struct thread_fp_state, fpr[32][0])); ^ check the end of array instead of beginning of next element to fix this Signed-off-by: Khem Raj <raj.khem@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Segher Boessenkool <segher@kernel.crashing.org> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-06 10:48:07 +10:00
Arnd Bergmann	169047f447	rtc: powerpc: provide rtc_class_ops directly The rtc-generic driver provides an architecture specific wrapper on top of the generic rtc_class_ops abstraction, and powerpc has another abstraction on top, which is a bit silly. This changes the powerpc rtc-generic device to provide its rtc_class_ops directly, to reduce the number of layers by one. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>	2016-06-04 00:23:34 +02:00
Geliang Tang	8cfc8ddc99	pstore: add lzo/lz4 compression support Like zlib compression in pstore, this patch added lzo and lz4 compression support so that users can have more options and better compression ratio. The original code treats the compressed data together with the uncompressed ECC correction notice by using zlib decompress. The ECC correction notice is missing in the decompression process. The treatment also makes lzo and lz4 not working. So I treat them separately by using pstore_decompress() to treat the compressed data, and memcpy() to treat the uncompressed ECC correction notice. Signed-off-by: Geliang Tang <geliangtang@163.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2016-06-02 10:59:31 -07:00
Thomas Huth	7cc851039d	powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call If we do not provide the PVR for POWER8NVL, a guest on this system currently ends up in PowerISA 2.06 compatibility mode on KVM, since QEMU does not provide a generic PowerISA 2.07 mode yet. So some new instructions from POWER8 (like "mtvsrd") get disabled for the guest, resulting in crashes when using code compiled explicitly for POWER8 (e.g. with the "-mcpu=power8" option of GCC). Fixes: `ddee09c099` ("powerpc: Add PVR for POWER8NVL processor") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-01 13:47:34 +10:00
Horia Geantă	d54fc90cc9	powerpc: add io{read,write}64 accessors This will allow device drivers to consistently use io{read,write}XX also for 64-bit accesses. Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-05-31 16:41:52 +08:00
Michal Hocko	6904817607	vdso: make arch_setup_additional_pages wait for mmap_sem for write killable most architectures are relying on mmap_sem for write in their arch_setup_additional_pages. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Andy Lutomirski <luto@amacapital.net> [x86 vdso] Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-23 17:04:14 -07:00
Jiri Slaby	5f56a5dfdb	exit_thread: remove empty bodies Define HAVE_EXIT_THREAD for archs which want to do something in exit_thread. For others, let's define exit_thread as an empty inline. This is a cleanup before we change the prototype of exit_thread to accept a task parameter. [akpm@linux-foundation.org: fix mips] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-20 17:58:30 -07:00
Linus Torvalds	c04a588029	powerpc updates for 4.7 Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW /AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP /ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/ DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2 zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO 19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5 udQES+t3F/9gvtxgxtDe =YvyQ -----END PGP SIGNATURE----- Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." * tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits) powerpc/86xx: Fix PCI interrupt map definition powerpc/86xx: Move pci1 definition to the include file powerpc/fsl: Fix build of the dtb embedded kernel images powerpc/fsl: Fix rcpm compatible string powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC powerpc/fsl-pci: Add a workaround for PCI 5 errata powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb powerpc/powernv/npu: Add PE to PHB's list powerpc/powernv: Fix insufficient memory allocation powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner() powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover() powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()" powerpc/powernv/npu: Enable NVLink pass through powerpc/powernv/npu: Rework TCE Kill handling powerpc/powernv/npu: Add set/unset window helpers powerpc/powernv/ioda2: Export debug helper pe_level_printk() ...	2016-05-20 10:12:41 -07:00
Linus Torvalds	0b86c75db6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching updates from Jiri Kosina: - remove of our own implementation of architecture-specific relocation code and leveraging existing code in the module loader to perform arch-dependent work, from Jessica Yu. The relevant patches have been acked by Rusty (for module.c) and Heiko (for s390). - live patching support for ppc64le, which is a joint work of Michael Ellerman and Torsten Duwe. This is coming from topic branch that is share between livepatching.git and ppc tree. - addition of livepatching documentation from Petr Mladek * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: make object/func-walking helpers more robust livepatch: Add some basic livepatch documentation powerpc/livepatch: Add live patching support on ppc64le powerpc/livepatch: Add livepatch stack to struct thread_info powerpc/livepatch: Add livepatch header livepatch: Allow architectures to specify an alternate ftrace location ftrace: Make ftrace_location_range() global livepatch: robustify klp_register_patch() API error checking Documentation: livepatch: outline Elf format and requirements for patch modules livepatch: reuse module loader code to write relocations module: s390: keep mod_arch_specific for livepatch modules module: preserve Elf information for livepatch modules Elf: add livepatch-specific Elf constants	2016-05-17 17:11:27 -07:00
Linus Torvalds	16bf834805	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits) gitignore: fix wording mfd: ab8500-debugfs: fix "between" in printk memstick: trivial fix of spelling mistake on management cpupowerutils: bench: fix "average" treewide: Fix typos in printk IB/mlx4: printk fix pinctrl: sirf/atlas7: fix printk spelling serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/ w1: comment spelling s/minmum/minimum/ Blackfin: comment spelling s/divsor/divisor/ metag: Fix misspellings in comments. ia64: Fix misspellings in comments. hexagon: Fix misspellings in comments. tools/perf: Fix misspellings in comments. cris: Fix misspellings in comments. c6x: Fix misspellings in comments. blackfin: Fix misspelling of 'register' in comment. avr32: Fix misspelling of 'definitions' in comment. treewide: Fix typos in printk Doc: treewide : Fix typos in DocBook/filesystem.xml ...	2016-05-17 17:05:30 -07:00
Guilherme G. Piccoli	c2078d9ef6	Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" This reverts commit `89a51df5ab`. The function eeh_add_device_early() is used to perform EEH initialization in devices added later on the system, like in hotplug/DLPAR scenarios. Since the commit `89a51df5ab` ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell") a new check was introduced in this function - Cell has no EEH capabilities which led to kernel oops if hotplug was performed, so checking for eeh_enabled() was introduced to avoid the issue. However, in architectures that EEH is present like pSeries or PowerNV, we might reach a case in which no PCI devices are present on boot time and so EEH is not initialized. Then, if a device is added via DLPAR for example, eeh_add_device_early() fails because eeh_enabled() is false, and EEH end up not being enabled at all. This reverts the aforementioned patch since a new verification was introduced by the commit `d91dafc02f` ("powerpc/eeh: Delay probing EEH device during hotplug") and so the original Cell issue does not happen anymore. Cc: stable@vger.kernel.org # v4.1+ Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-12 19:52:21 +10:00
Gavin Shan	d6d63d720d	powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner() The label "reset" in eeh_pe_change_owner() is used only for once. No need to keep it and just drop it. No logical changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-12 19:52:20 +10:00
Gavin Shan	2efc771f24	powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover() The function eeh_pe_reset_and_recover() is used to recover EEH error when the passthrough device are transferred to guest and backwards, meaning the device's driver is vfio-pci or none. In both cases, the handlers triggered by eeh_report_reset() and eeh_report_resume() shouldn't be called. This ignores the error handlers from eeh_report_reset() and eeh_report_resume(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-12 19:52:20 +10:00
Gavin Shan	5a0cdbfd17	powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() The function eeh_pe_reset_and_recover() is used to recover EEH error when the passthrou device are transferred to guest and backwards. The content in the device's config space will be lost on PE reset issued in the middle of the recovery. The function saves/restores it before/after the reset. However, config access to some adapters like Broadcom BCM5719 at this point will causes fenced PHB. The config space is always blocked and we save 0xFF's that are restored at late point. The memory BARs are totally corrupted, causing another EEH error upon access to one of the memory BARs. This restores the config space on those adapters like BCM5719 from the content saved to the EEH device when it's populated, to resolve above issue. Fixes: `5cfb20b9` ("powerpc/eeh: Emulate EEH recovery for VFIO devices") Cc: stable@vger.kernel.org #v3.18+ Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-12 19:52:20 +10:00
Gavin Shan	affeb0f2d3	powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() The function eeh_pe_reset_and_recover() is used to recover EEH error when the passthrough device are transferred to guest and backwards, meaning the device's driver is vfio-pci or none. When the driver is vfio-pci that provides error_detected() error handler only, the handler simply stops the guest and it's not expected behaviour. On the other hand, no error handlers will be called if we don't have a bound driver. This ignores the error handler in eeh_pe_reset_and_recover() that reports the error to device driver to avoid the exceptional behaviour. Fixes: `5cfb20b9` ("powerpc/eeh: Emulate EEH recovery for VFIO devices") Cc: stable@vger.kernel.org #v3.18+ Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-12 19:52:20 +10:00
Gavin Shan	4a5954ed77	powerpc/pci: Don't scan empty slot In hotplug case, function pci_add_pci_devices() is called to rescan the specified PCI bus, which might not have any child devices. Access to the PCI bus's child device node will cause kernel crash without exception. This adds one more check to skip scanning PCI bus that doesn't have any subordinate devices from device-tree, in order to avoid kernel crash. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:26 +10:00
Gavin Shan	cdddc577d9	powerpc/pci: Export pci_traverse_device_nodes() This renames traverse_pci_devices() to pci_traverse_device_nodes(). The function traverses all subordinate device nodes of the specified one. Also, below cleanup applied to the function. No logical changes introduced. * Rename "pre" to "fn". * Avoid assignment in if condition reported from checkpatch.pl. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:25 +10:00
Gavin Shan	de5a28ac5a	powerpc/pci: Introduce pci_remove_device_node_info() This implements and exports pci_remove_device_node_info(). It's used to remove the pdn (struct pci_dn) for the indicated device node. The function is going to be used by PowerNV PCI hotplug driver. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:25 +10:00
Gavin Shan	d8f66f411e	powerpc/pci: Export pci_add_device_node_info() This renames update_dn_pci_info() to pci_add_device_node_info() with corresponding adjustment on the parameter type and exports it. The function is used to create pdn (struct pci_dn) for the indicated device node. Another function add_pdn(), almost wrapper of pci_add_device_node_info(), to be used in traverse_pci_devices(). No logical changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:24 +10:00
Gavin Shan	6384d97780	powerpc/pci: Move pci_find_bus_by_node() around This moves pci_find_bus_by_node() from arch/powerpc/platforms/ pseries/pci_dlpar.c to arch/powerpc/kernel/pci-hotplug.c so that the function can be used by pSeries and PowerNV platform at the same time. Also, below cleanup applied. No functional changes introduced. * Remove variable "busdn" in find_bus_among_children() * Use PCI_DN() to convert device node to pci_dn Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:24 +10:00
Gavin Shan	bd251b893d	powerpc/pci: Rename pcibios_{add, remove}_pci_devices() This renames pcibios_{add,remove}_pci_devices() to avoid conflicts with names of the weak functions in PCI subsystem, which have the prefix "pcibios". No logical changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-By: Alistair Popple <alistair@popple.id.au> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:23 +10:00
Mahesh Salgaonkar	2513767d22	powerpc/powernv: Rename machine_check_pSeries_early() to powernv The routine machine_check_pSeries_early() is only used on powernv, not pseries. Hence rename machine_check_pSeries_early() to machine_check_powernv_early(). Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:11 +10:00
Suraj Jitindar Singh	925e2d1ded	powerpc: Update of_remove_property() call sites to remove null checking After obtaining a property from of_find_property() and before calling of_remove_property() most code checks to ensure that the property returned from of_find_property() is not null. The previous patch moved this check to the start of the function of_remove_property() in order to avoid the case where this check isn't done and a null value is passed. This ensures the check is always conducted before taking locks and attempting to remove the property. Thus it is no longer necessary to perform a check for null values before invoking of_remove_property(). Update of_remove_property() call sites in order to remove redundant checking for null property value as check is now performed within the of_remove_property function(). Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> [mpe: Unbreak some lines which are just >80 chars for readability] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:04 +10:00
Chris Smart	e44c1b15cf	powerpc: Remove unnecessary CONFIG_SMP #ifdefs The code in machine_restart/power_off/halt() includes #ifdefs around calls to smp_send_stop(), however these are not required as include/linux/smp.h includes an empty version of this function for CONFIG_SMP=n builds. Signed-off-by: Chris Smart <chris@distroguy.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:01 +10:00
Rashmica Gupta	c415c9cb8a	powerpc: Remove unused remnants from A2 cpu Support for the A2 cpu was removed in commit `fb5a515704` ("powerpc: Remove platforms/wsp and associated pieces"), and the externs: __setup_cpu_a2 and __restore_cpu_a2 are still around and unused, so remove them. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:00 +10:00
Valentin Rothberg	bb03efe2b7	powerpc/mm/radix: Fix CONFIG_PPC_MMU_STD_64 typo It's CONFIG_PPC_STD_MMU_64 not ... CONFIG_PPC_MMU_STD_64. Fixes: 11ffc1cfa4c2 ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code") Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:53:59 +10:00
Aneesh Kumar K.V	17a3dd2f5f	powerpc/mm/radix: Use firmware feature to enable Radix MMU We use the existing "ibm,pa-features" device-tree property to enable Radix MMU mode. This means we default to hash mode unless firmware tells us it's OK to start using Radix mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:53:58 +10:00
Aneesh Kumar K.V	43a5c68427	powerpc/mm/radix: Make sure swapper pgdir is properly aligned With 4K page size radix config our level 1 page table size is 64K and it should be naturally aligned. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:53:55 +10:00
Aneesh Kumar K.V	d6a9996e84	powerpc/mm: vmalloc abstraction in preparation for radix The vmalloc range differs between hash and radix config. Hence make VMALLOC_START and related constants a variable which will be runtime initialized depending on whether hash or radix mode is active. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Fix missing init of ioremap_bot in pgtable_64.c for ppc64e] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:53:53 +10:00
Aneesh Kumar K.V	caca285e5a	powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code We also use MMU_FTR_RADIX to branch out from code path specific to hash. No functionality change. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:53:45 +10:00
Peter Zijlstra (Intel)	e9d867a67f	sched: Allow per-cpu kernel threads to run on online && !active In order to enable symmetric hotplug, we must mirror the online && !active state of cpu-down on the cpu-up side. However, to retain sanity, limit this state to per-cpu kthreads. Aside from the change to set_cpus_allowed_ptr(), which allow moving the per-cpu kthreads on, the other critical piece is the cpu selection for pinned tasks in select_task_rq(). This avoids dropping into select_fallback_rq(). select_fallback_rq() cannot be allowed to select !active cpus because its used to migrate user tasks away. And we do not want to move user tasks onto cpus that are in transition. Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160301152303.GV6356@twins.programming.kicks-ass.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-05-06 14:58:22 +02:00
Aneesh Kumar K.V	1a472c9dba	powerpc/mm/radix: Add tlbflush routines Core kernel doesn't track the page size of the VA range that we are invalidating. Hence we end up flushing TLB for the entire mm here. Later patches will improve this. We also don't flush page walk cache separetly instead use RIC=2 when flushing TLB, because we do a MMU gather flush after freeing page table. MMU_NO_CONTEXT is updated for hash. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-01 18:33:09 +10:00
Aneesh Kumar K.V	d2adba3fd1	powerpc/mm: Abstraction for switch_mmu_context() How we switch MMU context differs between hash and radix. For hash we need to switch the SLB details and for radix we need to switch the PID SPR. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-01 18:33:04 +10:00
Aneesh Kumar K.V	dd1842a2a4	powerpc/mm: Make page table size a variable Radix and hash MMU models support different page table sizes. Make the #defines a variable so that existing code can work with variable sizes. Slice related code is only used by hash, so use hash constants there. We will replicate some of the boundary conditions with resepct to TASK_SIZE using radix values too. Right now we do boundary condition check using hash constants. Swapper pgdir size is initialized in asm code. We select the max pgd size to keep it simple. For now we select hash pgdir. When adding radix we will switch that to radix pgdir which is 64K. BUILD_BUG_ON check which is removed is already done in hugepage_init() using MAYBE_BUILD_BUG_ON(). Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-01 18:32:48 +10:00
Aneesh Kumar K.V	72176dd0ad	powerpc/mm: Use a helper for finding pte bits mapping I/O area Use a helper instead of open coding with constants. A later patch will drop the WIMG bits and use PowerISA 3.0 defines. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-01 18:32:32 +10:00
Masanari Iida	c01e01597c	treewide: Fix typos in printk This patch fix spelling typos in printk from various part of the codes. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2016-04-28 10:52:28 +02:00
Thiago Jung Bauermann	7132e2d669	ftrace: Match dot symbols when searching functions on ppc64 In the ppc64 big endian ABI, function symbols point to function descriptors. The symbols which point to the function entry points have a dot in front of the function name. Consequently, when the ftrace filter mechanism searches for the symbol corresponding to an entry point address, it gets the dot symbol. As a result, ftrace filter users have to be aware of this ABI detail on ppc64 and prepend a dot to the function name when setting the filter. The perf probe command insulates the user from this by ignoring the dot in front of the symbol name when matching function names to symbols, but the sysfs interface does not. This patch makes the ftrace filter mechanism do the same when searching symbols. Fixes the following failure in ftracetest's kprobe_ftrace.tc: .../kprobe_ftrace.tc: line 9: echo: write error: Invalid argument That failure is on this line of kprobe_ftrace.tc: echo _do_fork > set_ftrace_filter This is because there's no _do_fork entry in the functions list: # cat available_filter_functions \| grep _do_fork ._do_fork This change introduces no regressions on the perf and ftracetest testsuite results. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-27 09:47:29 +10:00
Chris Smart	8a649045e7	powerpc: Add support for userspace P9 copy paste The copy paste facility introduced in POWER9 provides an optimised mechanism for a userspace application to copy a cacheline. This is provided by a pair of instructions, copy and paste, while a third, cp_abort (copy paste abort), provides a clean up of the state in case of a failure. The copy instruction will read a 128 byte cacheline and store it in an internal buffer. The subsequent paste instruction will store this internal buffer to memory and set a CR field if the paste succeeds. Since the state of the copy paste buffer is internal (and not architecturally visible), in the unlikely event of a context switch, the state cannot be stored and the paste should therefore fail. The cp_abort instruction exists to fail and clean up any such interrupted copy paste sequence and is to be called by the kernel as part of the context switch. Doing so prevents data from a preceding copy in one process leaking into the paste of another. This code enables use of the cp_abort instruction if a supported processor is detected. NOTE: this is for userspace only, not in kernel, and does not deal with KVM guests. Patch created with much assistance from Michael Neuling <mikey@neuling.org> Signed-off-by: Chris Smart <chris@distroguy.com> Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-27 09:28:07 +10:00
Andrew Donnellan	2d5217840f	powerpc/eeh: fix misleading indentation Found by smatch. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-27 09:19:37 +10:00
Hari Bathini	057b6d7e62	powerpc/book3s64: Remove __end_handlers marker The __end_handlers marker was intended to mark down upto code that gets called from exception prologs. But that hasn't kept pace with code changes. Case in point, slb_miss_realmode being called from exception prolog code but isn't below __end_handlers marker. So, __end_handlers marker is as good as a comment but could be misleading at times if it isn't in sync with the code, as is the case now. So, let us avoid this confusion by having a better comment and removing __end_handlers marker altogether. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-21 23:32:58 +10:00
Hari Bathini	8ed8ab4004	powerpc/book3s64: Fix branching to OOL handlers in relocatable kernel Some of the interrupt vectors on 64-bit POWER server processors are only 32 bytes long (8 instructions), which is not enough for the full first-level interrupt handler. For these we need to branch to an out-of-line (OOL) handler. But when we are running a relocatable kernel, interrupt vectors till __end_interrupts marker are copied down to real address 0x100. So, branching to labels (ie. OOL handlers) outside this section must be handled differently (see LOAD_HANDLER()), considering relocatable kernel, which would need at least 4 instructions. However, branching from interrupt vector means that we corrupt the CFAR (come-from address register) on POWER7 and later processors as mentioned in commit `1707dd16`. So, EXCEPTION_PROLOG_0 (6 instructions) that contains the part up to the point where the CFAR is saved in the PACA should be part of the short interrupt vectors before we branch out to OOL handlers. But as mentioned already, there are interrupt vectors on 64-bit POWER server processors that are only 32 bytes long (like vectors 0x4f00, 0x4f20, etc.), which cannot accomodate the above two cases at the same time owing to space constraint. Currently, in these interrupt vectors, we simply branch out to OOL handlers, without using LOAD_HANDLER(), which leaves us vulnerable when running a relocatable kernel (eg. kdump case). While this has been the case for sometime now and kdump is used widely, we were fortunate not to see any problems so far, for three reasons: 1. In almost all cases, production kernel (relocatable) is used for kdump as well, which would mean that crashed kernel's OOL handler would be at the same place where we end up branching to, from short interrupt vector of kdump kernel. 2. Also, OOL handler was unlikely the reason for crash in almost all the kdump scenarios, which meant we had a sane OOL handler from crashed kernel that we branched to. 3. On most 64-bit POWER server processors, page size is large enough that marking interrupt vector code as executable (see commit `429d2e83`) leads to marking OOL handler code from crashed kernel, that sits right below interrupt vector code from kdump kernel, as executable as well. Let us fix this by moving the __end_interrupts marker down past OOL handlers to make sure that we also copy OOL handlers to real address 0x100 when running a relocatable kernel. This fix has been tested successfully in kdump scenario, on an LPAR with 4K page size by using different default/production kernel and kdump kernel. Also tested by manually corrupting the OOL handlers in the first kernel and then kdump'ing, and then causing the OOL handlers to fire - mpe. Fixes: `c1fb6816fb` ("powerpc: Add relocation on exception vector handlers") Cc: stable@vger.kernel.org Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-21 23:32:44 +10:00
Michael Ellerman	8404410b29	Merge branch 'topic/livepatch' into next Merge the support for live patching on ppc64le using mprofile-kernel. This branch has also been merged into the livepatching tree for v4.7.	2016-04-18 20:45:32 +10:00
Anton Blanchard	4705e02498	powerpc: Update TM user feature bits in scan_features() We need to update the user TM feature bits (PPC_FEATURE2_HTM and PPC_FEATURE2_HTM) to mirror what we do with the kernel TM feature bit. At the moment, if firmware reports TM is not available we turn off the kernel TM feature bit but leave the userspace ones on. Userspace thinks it can execute TM instructions and it dies trying. This (together with a QEMU patch) fixes PR KVM, which doesn't currently support TM. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-18 20:10:45 +10:00
Anton Blanchard	beff82374b	powerpc: Update cpu_user_features2 in scan_features() scan_features() updates cpu_user_features but not cpu_user_features2. Amongst other things, cpu_user_features2 contains the user TM feature bits which we must keep in sync with the kernel TM feature bit. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-18 20:10:45 +10:00
Anton Blanchard	6997e57d69	powerpc: scan_features() updates incorrect bits for REAL_LE The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU feature value, meaning all the remaining elements initialise the wrong values. This means instead of checking for byte 5, bit 0, we check for byte 0, bit 0, and then we incorrectly set the CPU feature bit as well as MMU feature bit 1 and CPU user feature bits 0 and 2 (5). Checking byte 0 bit 0 (IBM numbering), means we're looking at the "Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU. In practice that bit is set on all platforms which have the property. This means we set CPU_FTR_REAL_LE always. In practice that seems not to matter because all the modern cpus which have this property also implement REAL_LE, and we've never needed to disable it. We're also incorrectly setting MMU feature bit 1, which is: #define MMU_FTR_TYPE_8xx 0x00000002 Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E code, which can't run on the same cpus as scan_features(). So this also doesn't matter in practice. Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2 is not currently used, and bit 0 is: #define PPC_FEATURE_PPC_LE 0x00000001 Which says the CPU supports the old style "PPC Little Endian" mode. Again this should be harmless in practice as no 64-bit CPUs implement that mode. Fix the code by adding the missing initialisation of the MMU feature. Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It would be unsafe to start using it as old kernels incorrectly set it. Fixes: `44ae3ab335` ("powerpc: Free up some CPU feature bits by moving out MMU-related features") Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org [mpe: Flesh out changelog, add comment reserving 0x4] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-18 20:08:38 +10:00
Jiri Kosina	4d4fb97a62	Merge branch 'topic/livepatch' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into for-4.7/livepatching-ppc64le Pull livepatching support for ppc64 architecture from Michael Ellerman. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2016-04-15 11:42:51 +02:00
Michael Ellerman	85baa09549	powerpc/livepatch: Add live patching support on ppc64le Add the kconfig logic & assembly support for handling live patched functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn depends on the new -mprofile-kernel ftrace ABI, which is only supported currently on ppc64le. Live patching is handled by a special ftrace handler. This means it runs from ftrace_caller(). The live patch handler modifies the NIP so as to redirect the return from ftrace_caller() to the new patched function. However there is one particularly tricky case we need to handle. If a function A calls another function B, and it is known at link time that they share the same TOC, then A will not save or restore its TOC, and will call the local entry point of B. When we live patch B, we replace it with a new function C, which may not have the same TOC as A. At live patch time it's too late to modify A to do the TOC save/restore, so the live patching code must interpose itself between A and C, and do the TOC save/restore that A omitted. An additionaly complication is that the livepatch code can not create a stack frame in order to save the TOC. That is because if C takes > 8 arguments, or is varargs, A will have written the arguments for C in A's stack frame. To solve this, we introduce a "livepatch stack" which grows upward from the base of the regular stack, and is used to store the TOC & LR when calling a live patched function. When the patched function returns, we retrieve the real LR & TOC from the livepatch stack, restore them, and pop the livepatch "stack frame". Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Torsten Duwe <duwe@suse.de> Reviewed-by: Balbir Singh <bsingharora@gmail.com>	2016-04-14 15:48:06 +10:00
Michael Ellerman	5d31a96e6c	powerpc/livepatch: Add livepatch stack to struct thread_info In order to support live patching we need to maintain an alternate stack of TOC & LR values. We use the base of the stack for this, and store the "live patch stack pointer" in struct thread_info. Unlike the other fields of thread_info, we can not statically initialise that value, so it must be done at run time. This patch just adds the code to support that, it is not enabled until the next patch which actually adds live patch support. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com>	2016-04-14 15:47:06 +10:00
Daniel Axtens	7f92bc5694	powerpc: sparse: Include headers for __weak symbols Sometimes when sparse warns about undefined symbols, it isn't because they should have 'static' added, it's because they're overriding __weak symbols defined elsewhere, and the header has been missed. Fix a few of them by adding appropriate headers. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-12 21:05:19 +10:00
Daniel Axtens	635218c785	powerpc: sparse: static-ify some things As sparse suggests, these should be made static. Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-12 21:05:18 +10:00
Paul Gortmaker	c0c523897d	powerpc: make kernel/nvram_64.c explicitly non-modular The Makefile/Kconfig currently controlling compilation of this code is: obj-$(CONFIG_PPC64) += setup_64.o sys_ppc32.o \ signal_64.o ptrace32.o \ paca.o nvram_64.o firmware.o arch/powerpc/platforms/Kconfig.cputype:config PPC64 arch/powerpc/platforms/Kconfig.cputype: bool "64-bit kernel" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We don't replace module.h with init.h since the file already has that. We delete the MODULE_LICENSE tag since that information is already contained at the top of the file in the comments. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Andrzej Hajda <a.hajda@samsung.com> Cc: Anton Blanchard <anton@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-11 20:30:43 +10:00
Russell Currey	8ee26530bb	powerpc/eeh: rename EEH from "extended" to "enhanced" error handling IBM online documentation for EEH uses "extended error handling" and "enhanced error handling" to refer to the same thing, in different places. The only place mentioning it as "enhanced error handling" in the kernel is the MAINTAINERS file, and it's "extended" in some documentation. IBM originally defined EEH as "enhanced error handling", so standardise all mentions of EEH to use that term. Signed-off-by: Russell Currey <ruscur@russell.cc> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-11 20:30:42 +10:00
Michael Ellerman	b05fac783a	powerpc: Remove orphaned asm implementation of abs() This has been unused since ~2004, remove it. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-11 20:30:41 +10:00
Michael Ellerman	1f4c66e805	powerpc/mm: Remove long disabled SLB code We have a bunch of SLB related code in the tree which is there to handle dynamic VSIDs - but currently it's all disabled at compile time. The comments say "Keep that around for when we re-implement dynamic VSIDs". But that was over 10 years ago (commit `3c726f8dee` ("[PATCH] ppc64: support 64k pages")). The chance that it would still work unchanged is minimal, and in the meantime it's confusing to folks browsing/grepping the code. If we ever want to re-instate it, it's in the git history. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com>	2016-04-11 20:30:40 +10:00
Oliver O'Halloran	01d7c2a2de	powerpc/process: Fix altivec SPR not being saved In save_sprs() in process.c contains the following test: if (cpu_has_feature(cpu_has_feature(CPU_FTR_ALTIVEC))) t->vrsave = mfspr(SPRN_VRSAVE); CPU feature with the mask 0x1 is CPU_FTR_COHERENT_ICACHE so the test is equivilent to: if (cpu_has_feature(CPU_FTR_ALTIVEC) && cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) On CPUs without support for both (i.e G5) this results in vrsave not being saved between context switches. The vector register save/restore code doesn't use VRSAVE to determine which registers to save/restore, but the value of VRSAVE is used to determine if altivec is being used in several code paths. Fixes: `152d523e63` ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") Cc: stable@vger.kernel.org Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-29 12:08:08 +11:00
Alexander Potapenko	be7635e728	arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-25 16:37:42 -07:00
Linus Torvalds	d5e2d00898	powerpc updates for 4.6 Highlights: - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V - Add POWER9 cputable entry from Michael Neuling - FPU/Altivec/VSX save/restore optimisations from Cyril Bur - Add support for new ftrace ABI on ppc64le from Torsten Duwe Various cleanups & minor fixes from: - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh. General: - atomics: Allow architectures to define their own __atomic_op_* helpers from Boqun Feng - Implement atomic{, 64}__return_ variants and acquire/release/relaxed variants for (cmp)xchg from Boqun Feng - Add powernv_defconfig from Jeremy Kerr - Fix BUG_ON() reporting in real mode from Balbir Singh - Add xmon command to dump OPAL msglog from Andrew Donnellan - Add xmon command to dump process/task similar to ps(1) from Douglas Miller - Clean up memory hotplug failure paths from David Gibson pci/eeh: - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei Yang. - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan. - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang - PCI: Add pcibios_bus_add_device() weak function from Wei Yang - MAINTAINERS: Update EEH details and maintainership from Russell Currey cxl: - Support added to the CXL driver for running on both bare-metal and hypervisor systems, from Christophe Lombard and Frederic Barrat. - Ignore probes for virtual afu pci devices from Vaibhav Jain perf: - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu - hv-24x7: Fix usage with chip events, display change in counter values, display domain indices in sysfs, eliminate domain suffix in event names, from Sukadev Bhattiprolu Freescale: - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup." -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW69OrAAoJEFHr6jzI4aWAe5EQAJw/hE6WBQc6a7Tj70AnXOqR qk/m5pZjuTwQxfBteIvHR1pE5eXdlvtAjcD254LVkFkAbIn19W/h2k0VX/nlee7P n/VRHRifjtGmukqHrPYJJ7ua9mNlY7pxh3leGSixBFASnSWqMxNNNziNQtSTcuCs TjHiw6NkZ/kzeunA4bAfE4yHVUZjmL74oiS9JbLyaVHqoW4fqWLlh26AKo2yYMZI qPicBBG4HBi3FGvoexnKxlJNdcV4HO7LzDjJmCSfUKYCJi+Pw19T5qmhso0q0qVz vHg/A8HNeG4Hn83pNVmLeQSAIQRZ3DvTtcLgbjPo+TVwm/hzrRRBWipTeOVbkLW8 2bcOXT4t7LWUq15EAJ1LYgYZGzcLrfRfUeOcuQ1TWd3+PcfY9pE7FmizsxAAfaVe E9j9mpz4XnIqBtWkFHneTIHkQ5OWptyKuZJEaYH0nut4VsP0k8NarkseafGqBPu7 5eG83gbiQbCVixfOgblV9eocJ29JcwpjPAY4CZSGJimShg909FV7WRgZgJkKWrbK dBRco8Jcp4VglGfo2qymv7Uj4KwQoypBREOhiKUvrAsVlDxPfx+bcskhjGu9xGDC xs/+nme0/lKa/wg5K4C3mQ1GAlkMWHI0ojhJjsyODbetup5UbkEu03wjAaTdO9dT Y6ptGm0rYAJluPNlziFj =qkAt -----END PGP SIGNATURE----- Merge tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "This was delayed a day or two by some build-breakage on old toolchains which we've now fixed. There's two PCI commits both acked by Bjorn. There's one commit to mm/hugepage.c which is (co)authored by Kirill. Highlights: - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V - Add POWER9 cputable entry from Michael Neuling - FPU/Altivec/VSX save/restore optimisations from Cyril Bur - Add support for new ftrace ABI on ppc64le from Torsten Duwe Various cleanups & minor fixes from: - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh. General: - atomics: Allow architectures to define their own __atomic_op_* helpers from Boqun Feng - Implement atomic{, 64}__return_ variants and acquire/release/ relaxed variants for (cmp)xchg from Boqun Feng - Add powernv_defconfig from Jeremy Kerr - Fix BUG_ON() reporting in real mode from Balbir Singh - Add xmon command to dump OPAL msglog from Andrew Donnellan - Add xmon command to dump process/task similar to ps(1) from Douglas Miller - Clean up memory hotplug failure paths from David Gibson pci/eeh: - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei Yang. - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan. - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang - PCI: Add pcibios_bus_add_device() weak function from Wei Yang - MAINTAINERS: Update EEH details and maintainership from Russell Currey cxl: - Support added to the CXL driver for running on both bare-metal and hypervisor systems, from Christophe Lombard and Frederic Barrat. - Ignore probes for virtual afu pci devices from Vaibhav Jain perf: - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu - hv-24x7: Fix usage with chip events, display change in counter values, display domain indices in sysfs, eliminate domain suffix in event names, from Sukadev Bhattiprolu Freescale: - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup" * tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits) powerpc: Fix unrecoverable SLB miss during restore_math() powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers powerpc/rcpm: Fix build break when SMP=n powerpc/book3e-64: Use hardcoded mttmr opcode powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible powerpc/T104xRDB: add tdm riser card node to device tree powerpc32: PAGE_EXEC required for inittext powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s) powerpc/86xx: Introduce and use common dtsi powerpc/86xx: Update device tree powerpc/86xx: Move dts files to fsl directory powerpc/86xx: Switch to kconfig fragments approach powerpc/86xx: Update defconfigs powerpc/86xx: Consolidate common platform code powerpc32: Remove one insn in mulhdu powerpc32: small optimisation in flush_icache_range() powerpc: Simplify test in __dma_sync() powerpc32: move xxxxx_dcache_range() functions inline powerpc32: Remove clear_pages() and define clear_page() inline ...	2016-03-19 15:38:41 -07:00
Kees Cook	4cc7ecb7f2	param: convert some "on"/"off" users to strtobool This changes several users of manual "on"/"off" parsing to use strtobool. Some side-effects: - these uses will now parse y/n/1/0 meaningfully too - the early_param uses will now bubble up parse errors Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Amitkumar Karwar <akarwar@marvell.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Joe Perches <joe@perches.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nishant Sarmukadam <nishants@marvell.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steve French <sfrench@samba.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Joonsoo Kim	e7df0d88c4	powerpc: query dynamic DEBUG_PAGEALLOC setting We can disable debug_pagealloc processing even if the code is compiled with CONFIG_DEBUG_PAGEALLOC. This patch changes the code to query whether it is enabled or not in runtime. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-17 15:09:34 -07:00
Linus Torvalds	10dc374766	One of the largest releases for KVM... Hardly any generic improvement, but lots of architecture-specific changes. * ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. * PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). * s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. * x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory---currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJW5r3BAAoJEL/70l94x66D2pMH/jTSWWwdTUJMctrDjPVzKzG0 yOzHW5vSLFoFlwEOY2VpslnXzn5TUVmCAfrdmFNmQcSw6hGb3K/xA/ZX/KLwWhyb oZpr123ycahga+3q/ht/dFUBCCyWeIVMdsLSFwpobEBzPL0pMgc9joLgdUC6UpWX tmN0LoCAeS7spC4TTiTTpw3gZ/L+aB0B6CXhOMjldb9q/2CsgaGyoVvKA199nk9o Ngu7ImDt7l/x1VJX4/6E/17VHuwqAdUrrnbqerB/2oJ5ixsZsHMGzxQ3sHCmvyJx WG5L00ubB1oAJAs9fBg58Y/MdiWX99XqFhdEfxq4foZEiQuCyxygVvq3JwZTxII= =OUZZ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "One of the largest releases for KVM... Hardly any generic changes, but lots of architecture-specific updates. ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory - currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits) KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch KVM: x86: disable MPX if host did not enable MPX XSAVE features arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit arm64: KVM: vgic-v3: Reset LRs at boot time arm64: KVM: vgic-v3: Do not save an LR known to be empty arm64: KVM: vgic-v3: Save maintenance interrupt state only if required arm64: KVM: vgic-v3: Avoid accessing ICH registers KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit KVM: arm/arm64: vgic-v2: Reset LRs at boot time KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers KVM: s390: allocate only one DMA page per VM KVM: s390: enable STFLE interpretation only if enabled for the guest KVM: s390: wake up when the VCPU cpu timer expires KVM: s390: step the VCPU timer while in enabled wait KVM: s390: protect VCPU cpu timer with a seqcount KVM: s390: step VCPU cpu timer during kvm_run ioctl ...	2016-03-16 09:55:35 -07:00
Cyril Bur	6e669f085d	powerpc: Fix unrecoverable SLB miss during restore_math() Commit `70fe3d9` "powerpc: Restore FPU/VEC/VSX if previously used" introduces a call to restore_math() late in the syscall return path, after MSR_RI has been cleared. The MSR_RI flag is used to indicate whether the kernel can take another exception or not. A cleared MSR_RI flag indicates that the kernel cannot. Unfortunately when a machine is under SLB pressure an SLB miss can occur in restore_math() which (with MSR_RI cleared) leads to an unrecoverable exception. Unrecoverable exception 4100 at c0000000000088d8 cpu 0x0: Vector: 4100 at [c0000003fa473b20] pc: c0000000000088d8: .load_vr_state+0x70/0x110 lr: c00000000000f710: .restore_math+0x130/0x188 sp: c0000003fa473da0 msr: 9000000002003030 current = 0xc0000007f876f180 paca = 0xc00000000fff0000 softe: 0 irq_happened: 0x01 pid = 1944, comm = K08umountfs [link register ] c00000000000f710 .restore_math+0x130/0x188 [c0000003fa473da0] c0000003fa473e30 (unreliable) [c0000003fa473e30] c000000000007b6c system_call+0x84/0xfc The clearing of MSR_RI is actually an optimisation to avoid multiple MSR writes, what must be disabled are interrupts. See comment in entry_64.S: /* * For performance reasons we clear RI the same time that we * clear EE. We only need to clear RI just before we restore r13 * below, but batching it with EE saves us one expensive mtmsrd call. * We have to be careful to restore RI if we branch anywhere from * here (eg syscall_exit_work). */ At the point of calling restore_math() r13 has not been restored, as such, the quick fix of turning MSR_RI back on for the call to restore_math() will eliminate the occurrence of an unrecoverable exception. We'd like to do a better fix in future. Fixes: `70fe3d980f` ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-16 15:23:02 +11:00
Scott Wood	7a25d91214	powerpc/book3e-64: Use hardcoded mttmr opcode This preserves the ability to build using older binutils (reportedly <= 2.22). Fixes: `6becef7ea0` ("powerpc/mpc85xx: Add CPU hotplug support for E6500") Signed-off-by: Scott Wood <oss@buserror.net> Cc: chenhui.zhao@freescale.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-16 15:22:16 +11:00
Linus Torvalds	710d60cbf1	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cpu hotplug updates from Thomas Gleixner: "This is the first part of the ongoing cpu hotplug rework: - Initial implementation of the state machine - Runs all online and prepare down callbacks on the plugged cpu and not on some random processor - Replaces busy loop waiting with completions - Adds tracepoints so the states can be followed" More detailed commentary on this work from an earlier email: "What's wrong with the current cpu hotplug infrastructure? - Asymmetry The hotplug notifier mechanism is asymmetric versus the bringup and teardown. This is mostly caused by the notifier mechanism. - Largely undocumented dependencies While some notifiers use explicitely defined notifier priorities, we have quite some notifiers which use numerical priorities to express dependencies without any documentation why. - Control processor driven Most of the bringup/teardown of a cpu is driven by a control processor. While it is understandable, that preperatory steps, like idle thread creation, memory allocation for and initialization of essential facilities needs to be done before a cpu can boot, there is no reason why everything else must run on a control processor. Before this patch series, bringup looks like this: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring the rest up - All or nothing approach There is no way to do partial bringups. That's something which is really desired because we waste e.g. at boot substantial amount of time just busy waiting that the cpu comes to life. That's stupid as we could very well do preparatory steps and the initial IPI for other cpus and then go back and do the necessary low level synchronization with the freshly booted cpu. - Minimal debuggability Due to the notifier based design, it's impossible to switch between two stages of the bringup/teardown back and forth in order to test the correctness. So in many hotplug notifiers the cancel mechanisms are either not existant or completely untested. - Notifier [un]registering is tedious To [un]register notifiers we need to protect against hotplug at every callsite. There is no mechanism that bringup/teardown callbacks are issued on the online cpus, so every caller needs to do it itself. That also includes error rollback. What's the new design? The base of the new design is a symmetric state machine, where both the control processor and the booting/dying cpu execute a well defined set of states. Each state is symmetric in the end, except for some well defined exceptions, and the bringup/teardown can be stopped and reversed at almost all states. So the bringup of a cpu will look like this in the future: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu bring itself up The synchronization step does not require the control cpu to wait. That mechanism can be done asynchronously via a worker or some other mechanism. The teardown can be made very similar, so that the dying cpu cleans up and brings itself down. Cleanups which need to be done after the cpu is gone, can be scheduled asynchronously as well. There is a long way to this, as we need to refactor the notion when a cpu is available. Today we set the cpu online right after it comes out of the low level bringup, which is not really correct. The proper mechanism is to set it to available, i.e. cpu local threads, like softirqd, hotplug thread etc. can be scheduled on that cpu, and once it finished all booting steps, it's set to online, so general workloads can be scheduled on it. The reverse happens on teardown. First thing to do is to forbid scheduling of general workloads, then teardown all the per cpu resources and finally shut it off completely. This patch series implements the basic infrastructure for this at the core level. This includes the following: - Basic state machine implementation with well defined states, so ordering and prioritization can be expressed. - Interfaces to [un]register state callbacks This invokes the bringup/teardown callback on all online cpus with the proper protection in place and [un]installs the callbacks in the state machine array. For callbacks which have no particular ordering requirement we have a dynamic state space, so that drivers don't have to register an explicit hotplug state. If a callback fails, the code automatically does a rollback to the previous state. - Sysfs interface to drive the state machine to a particular step. This is only partially functional today. Full functionality and therefor testability will be achieved once we converted all existing hotplug notifiers over to the new scheme. - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying processor: Control CPU Booting CPU do preparatory steps kick cpu into life do low level init sync with booting cpu sync with control cpu wait for boot bring itself up Signal completion to control cpu In a previous step of this work we've done a full tree mechanical conversion of all hotplug notifiers to the new scheme. The balance is a net removal of about 4000 lines of code. This is not included in this series, as we decided to take a different approach. Instead of mechanically converting everything over, we will do a proper overhaul of the usage sites one by one so they nicely fit into the symmetric callback scheme. I decided to do that after I looked at the ugliness of some of the converted sites and figured out that their hotplug mechanism is completely buggered anyway. So there is no point to do a mechanical conversion first as we need to go through the usage sites one by one again in order to achieve a full symmetric and testable behaviour" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) cpu/hotplug: Document states better cpu/hotplug: Fix smpboot thread ordering cpu/hotplug: Remove redundant state check cpu/hotplug: Plug death reporting race rcu: Make CPU_DYING_IDLE an explicit call cpu/hotplug: Make wait for dead cpu completion based cpu/hotplug: Let upcoming cpu bring itself fully up arch/hotplug: Call into idle with a proper state cpu/hotplug: Move online calls to hotplugged cpu cpu/hotplug: Create hotplug threads cpu/hotplug: Split out the state walk into functions cpu/hotplug: Unpark smpboot threads from the state machine cpu/hotplug: Move scheduler cpu_online notifier to hotplug core cpu/hotplug: Implement setup/removal interface cpu/hotplug: Make target state writeable cpu/hotplug: Add sysfs state interface cpu/hotplug: Hand in target state to _cpu_up/down cpu/hotplug: Convert the hotplugged cpu work to a state machine cpu/hotplug: Convert to a state machine for the control processor cpu/hotplug: Add tracepoints ...	2016-03-15 13:50:29 -07:00
Michael Ellerman	a1b5344620	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup."	2016-03-14 20:05:14 +11:00
Christophe Leroy	737b01fca3	powerpc32: Remove one insn in mulhdu Remove one instruction in mulhdu Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:20:12 -06:00
Christophe Leroy	716fa91d19	powerpc32: small optimisation in flush_icache_range() Inlining of _dcache_range() functions has shown that the compiler does the same thing a bit better with one insn less Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:20:12 -06:00
Christophe Leroy	affe587bac	powerpc32: move xxxxx_dcache_range() functions inline flush/clean/invalidate _dcache_range() functions are all very similar and are quite short. They are mainly used in __dma_sync() perf_event locate them in the top 3 consumming functions during heavy ethernet activity They are good candidate for inlining, as __dma_sync() does almost nothing but calling them Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:20:12 -06:00
Christophe Leroy	5736f96d12	powerpc32: Remove clear_pages() and define clear_page() inline clear_pages() is never used expect by clear_page, and PPC32 is the only architecture (still) having this function. Neither PPC64 nor any other architecture has it. This patch removes clear_pages() and moves clear_page() function inline (same as PPC64) as it only is a few isns Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:20:11 -06:00
Christophe Leroy	766d45cbee	powerpc/8xx: rewrite flush_instruction_cache() in C On PPC8xx, flushing instruction cache is performed by writing in register SPRN_IC_CST. This registers suffers CPU6 ERRATA. The patch rewrites the fonction in C so that CPU6 ERRATA will be handled transparently Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:20:11 -06:00
Christophe Leroy	a7761fe489	powerpc/8xx: rewrite set_context() in C There is no real need to have set_context() in assembly. Now that we have mtspr() handling CPU6 ERRATA directly, we can rewrite set_context() in C language for easier maintenance. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:20:11 -06:00
Christophe Leroy	63e9e1c28f	powerpc/8xx: remove special handling of CPU6 errata in set_dec() CPU6 ERRATA is now handled directly in mtspr(), so we can use the standard set_dec() fonction in all cases. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:18:03 -06:00
Christophe Leroy	a372acfac5	powerpc/8xx: Map linear kernel RAM with 8M pages On a live running system (VoIP gateway for Air Trafic Control), over a 10 minutes period (with 277s idle), we get 87 millions DTLB misses and approximatly 35 secondes are spent in DTLB handler. This represents 5.8% of the overall time and even 10.8% of the non-idle time. Among those 87 millions DTLB misses, 15% are on user addresses and 85% are on kernel addresses. And within the kernel addresses, 93% are on addresses from the linear address space and only 7% are on addresses from the virtual address space. MPC8xx has no BATs but it has 8Mb page size. This patch implements mapping of kernel RAM using 8Mb pages, on the same model as what is done on the 40x. In 4k pages mode, each PGD entry maps a 4Mb area: we map every two entries to the same 8Mb physical page. In each second entry, we add 4Mb to the page physical address to ease life of the FixupDAR routine. This is just ignored by HW. In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry will point to the first page of the area. The DTLB handler adds the 3 bits from EPN to map the correct page. With this patch applied, we now get only 13 millions TLB misses during the 10 minutes period. The idle time has increased to 313s and the overall time spent in DTLB miss handler is 6.3s, which represents 1% of the overall time and 2.2% of non-idle time. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:18:01 -06:00
Christophe Leroy	913a6b3d10	powerpc/8xx: Save r3 all the time in DTLB miss handler We are spending between 40 and 160 cycles with a mean of 65 cycles in the DTLB handling routine (measured with mftbl) so make it more simple althought it adds one instruction. With this modification, we get three registers available at all time, which will help with following patch. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-11 17:18:01 -06:00
Michael Ellerman	d8c0282f4d	Merge branch 'topic/mprofile-kernel' into next Merge the ftrace changes to support -mprofile-kernel on ppc64le. This is a prerequisite for live patching, the support for which will be merged via the livepatch tree based on this topic branch.	2016-03-11 11:20:15 +11:00
Ingo Molnar	6cbe9e4a22	Merge branch 'linus' into locking/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-10 10:28:27 +01:00
Christophe Leroy	921fff351c	powerpc/8xx: CONFIG_DEBUG_PAGEALLOC requires ITLBmiss for kernel addresses When CONFIG_DEBUG_PAGEALLOC is activated, the initial TLB mapping gets flushed to track accesses to wrong areas. Therefore, kernel addresses will also generate ITLB misses. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-09 10:44:17 -06:00
Paolo Bonzini	ab92f30875	KVM/ARM updates for 4.6 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - Various optimizations to the vgic save/restore code -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW36xjAAoJECPQ0LrRPXpDGQkQAMDppzcTOixT3e8VPdHAX09a Z5PO0gyTMVV7Jyz5Ul3pedPJA2GSK9mxOCwqvIFbdxLAR6ZB00juO5FrTHkSdI91 1XLPj4bKoMWcVvhL/g5A4Glp/pVMW1k/9Yq8zZAtYlsLRlqG5rLOutSadcqHcYaJ cTD/pFf7b2oPtkTPyoFml75KgHBT/8uvAvFDOWA66Id2z6T11+PsBT/6XnGDiwKg tpGTNzx3kPIKIzOAOHqVW6UBxFOeabebXLT8wUz3VwNn/UbG6gkumMNApMAyF2q1 zU0nAh8+7Ek6Dr4OFWE6BfW6sgg/l7i1lA8XoAmqG7ZTrSptCc59fvaZJxPruG+Q dMsU6QgR77JJjbZTinf9a1jReZ/liZrx2gZXedVKdILrjmDSq0UnGcxjUOEDZOGy 2/dbrlJhv+LhpcJtuPpxPCfoqbW5L0ynzmuYuXRdRz3lTHiOWIRx5gugrhO+wH4D 4gvZhbw3XCiYfpYHYhl8A1EH5kanKgdXDocz9yIm7mZm89gngufF/HkeXS3ZU25T yThyBGulGjqN4FCdgf1HolkTfFjnfSx4qJovJ58eHga+HNLXRkTecZZcbFy2OOHv 8Bx0PIlwj4RgSaRLWQUudAhdhKS2g22DKDDljxFwhkMPNghvqkYMJCRDKLu6GBXQ 4YsLKM+TaShHFjSpx+ao =rpvb -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.6 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - Various optimizations to the vgic save/restore code Conflicts: include/uapi/linux/kvm.h	2016-03-09 11:50:42 +01:00
Andrew Donnellan	949e9b827e	powerpc/eeh: eeh_pci_enable(): fix checking of post-request state In eeh_pci_enable(), after making the request to set the new options, we call eeh_ops->wait_state() to check that the request finished successfully. At the moment, if eeh_ops->wait_state() returns 0, we return 0 without checking that it reflects the expected outcome. This can lead to callers further up the chain incorrectly assuming the slot has been successfully unfrozen and continuing to attempt recovery. On powernv, this will occur if pnv_eeh_get_pe_state() or pnv_eeh_get_phb_state() return 0, which in turn occurs if the relevant OPAL call returns OPAL_EEH_STOPPED_MMIO_DMA_FREEZE or OPAL_EEH_PHB_ERROR respectively. On pseries, this will occur if pseries_eeh_get_state() returns 0, which in turn occurs if RTAS reports that the PE is in the MMIO Stopped and DMA Stopped states. Obviously, none of these cases represent a successful completion of a request to thaw MMIO or DMA. Fix the check so that a wait_state() return value of 0 won't be considered successful for the EEH_OPT_THAW_MMIO or EEH_OPT_THAW_DMA cases. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 11:33:30 +11:00
Gavin Shan	b6c7347f2f	powerpc/eeh: Remove duplicated check in eeh_dump_pe_log() When eeh_dump_pe_log() is only called by eeh_slot_error_detail(), we already have the check that the PE isn't in PCI config blocked state in eeh_slot_error_detail(). So we needn't the duplicated check in eeh_dump_pe_log(). This removes the duplicated check in eeh_dump_pe_log(). No logical changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 10:25:35 +11:00
Gavin Shan	eca036ee1b	powerpc/eeh: Synchronize recovery in host/guest When passing through SRIOV VFs to guest, we possibly encounter EEH error on PF. In this case, the VF PEs are put into frozen state. The error could be reported to guest before it's captured by the host. That means the guest could attempt to recover errors on VFs before host gets chance to recover errors on PFs. The VFs won't be recovered successfully. This enforces the recovery order for above case: the recovery on child PE in guest is hold until the recovery on parent PE in host is completed. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:28 +11:00
Gavin Shan	3fa7bf7229	powerpc/eeh: Don't remove passed VFs When we have partial hotplug as part of the error recovery on PF, the VFs that are bound with vfio-pci driver will experience hotplug. That's not allowed. This checks if the VF PE is passed or not. If it does, we leave the VF without removing it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:27 +11:00
Gavin Shan	2311cca555	powerpc/eeh: Don't propagate error to guest When EEH error happened to the parent PE of those PEs that have been passed through to guest, the error is propagated to guest domain and the VFIO driver's error handlers are called. It's not correct as the error in the host domain shouldn't be propagated to guests and affect them. This adds one more limitation when calling EEH error handlers. If the PE has been passed through to guest, the error handlers won't be called. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:25 +11:00
Wei Yang	67086e32b5	powerpc/eeh: powerpc/eeh: Support error recovery for VF PE PFs are enumerated on PCI bus, while VFs are created by PF's driver. In EEH recovery, it has two cases: 1. Device and driver is EEH aware, error handlers are called. 2. Device and driver is not EEH aware, un-plug the device and plug it again by enumerating it. The special thing happens on the second case. For a PF, we could use the original pci core to enumerate the bus, while for VF we need to record the VFs which aer un-plugged then plug it again. Also The patch caches the VF index in pci_dn, which can be used to calculate VF's bus, device and function number. Those information helps to locate the VF's PCI device instance when doing hotplug during EEH recovery if necessary. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:23 +11:00
Wei Yang	9312bc5bab	powerpc/powernv: Support EEH reset for VF PE PEs for VFs don't have primary bus. So they have to have their own reset backend, which is used during EEH recovery. The patch implements the reset backend for VF's PE by issuing FLR or AF FLR to the VFs, which are contained in the PE. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:21 +11:00
Wei Yang	c29fa27d26	powerpc/eeh: Create PE for VFs This creates PEs for VFs in the weak function pcibios_bus_add_device(). Those PEs for VFs are identified with newly introduced flag EEH_PE_VF so that we treat them differently during EEH recovery. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:19 +11:00
Wei Yang	39218cd00e	powerpc/eeh: EEH device for VF VFs and their corresponding pdn are created and released dynamically when their PF's SRIOV capability is enabled and disabled. This creates and releases EEH devices for VFs when creating and releasing their pdn instances, which means EEH devices and pdn instances have same life cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:18 +11:00
Wei Yang	51c0e87e9a	powerpc/eeh: Cache normal BARs, not windows or IOV BARs This restricts the EEH address cache to use only the first 7 BARs. This makes __eeh_addr_cache_insert_dev() ignore PCI bridge window and IOV BARs. As the result of this change, eeh_addr_cache_get_dev() will return VFs from VF's resource addresses instead of parent PFs. This also removes PCI bridge check as we limit __eeh_addr_cache_insert_dev() to 7 BARs and this effectively excludes PCI bridges from being cached. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:17 +11:00
Wei Yang	971427f582	powerpc/pci: Remove VFs prior to PF As commit `ac205b7bb7` ("PCI: make sriov work with hotplug remove") indicates, VFs which is on the same PCI bus as their PF, should be removed before the PF. Otherwise, we might run into kernel crash at PCI unplugging time. This applies the above pattern to powerpc PCI hotplug path. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:58:15 +11:00
Gavin Shan	4eb0799ff9	powerpc/eeh: Reworked eeh_pe_bus_get() The original implementation is ugly: unnecessary if statements and "out" tag. This reworks the function to avoid above weaknesses. No functional changes introduced. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 09:57:46 +11:00
Torsten Duwe	153086644f	powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI The gcc switch -mprofile-kernel defines a new ABI for calling _mcount() very early in the function with minimal overhead. Although mprofile-kernel has been available since GCC 3.4, there were bugs which were only fixed recently. Currently it is known to work in GCC 4.9, 5 and 6. Additionally there are two possible code sequences generated by the flag, the first uses mflr/std/bl and the second is optimised to omit the std. Currently only gcc 6 has the optimised sequence. This patch supports both sequences. Initial work started by Vojtech Pavlik, used with permission. Key changes: - rework _mcount() to work for both the old and new ABIs. - implement new versions of ftrace_caller() and ftrace_graph_caller() which deal with the new ABI. - updates to __ftrace_make_nop() to recognise the new mcount calling sequence. - updates to __ftrace_make_call() to recognise the nop'ed sequence. - implement ftrace_modify_call(). - updates to the module loader to surpress the toc save in the module stub when calling mcount with the new ABI. Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-07 14:53:55 +11:00
Torsten Duwe	9a7841ae8d	powerpc/ftrace: Use $(CC_FLAGS_FTRACE) when disabling ftrace Rather than open-coding -pg whereever we want to disable ftrace, use the existing $(CC_FLAGS_FTRACE) variable. This has the advantage that it will work in future when we use a different set of flags to enable ftrace. Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-07 14:53:55 +11:00
Torsten Duwe	c96f83856f	powerpc/ftrace: Use generic ftrace_modify_all_code() Convert powerpc's arch_ftrace_update_code() from its own version to use the generic default functionality (without stop_machine -- our instructions are properly aligned and the replacements atomic). With this we gain error checking and the much-needed function_trace_op handling. Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-07 14:53:54 +11:00
Michael Ellerman	336a7b5dd8	powerpc/module: Create a special stub for ftrace_caller() In order to support the new -mprofile-kernel ABI, we need to be able to call from the module back to ftrace_caller() (in the kernel) without using the module's r2. That is because the function in this module which is calling ftrace_caller() may not have setup r2, if it doesn't otherwise need it (ie. it accesses no globals). To make that work we add a new stub which is used for calling ftrace_caller(), which uses the kernel toc instead of the module toc. Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-07 14:53:54 +11:00
Michael Ellerman	f17c4e01e9	powerpc/module: Mark module stubs with a magic value When a module is loaded, calls out to the kernel go via a stub which is generated at runtime. One of these stubs is used to call _mcount(), which is the default target of tracing calls generated by the compiler with -pg. If dynamic ftrace is enabled (which it typically is), another stub is used to call ftrace_caller(), which is the target of tracing calls when ftrace is actually active. ftrace then wants to disable the calls to _mcount() at module startup, and enable/disable the calls to ftrace_caller() when enabling/disabling tracing - all of these it does by patching the code. As part of that code patching, the ftrace code wants to confirm that the branch it is about to modify, is in fact a call to a module stub which calls _mcount() or ftrace_caller(). Currently it does that by inspecting the instructions and confirming they are what it expects. Although that works, the code to do it is pretty intricate because it requires lots of knowledge about the exact format of the stub. We can make that process easier by marking the generated stubs with a magic value, and then looking for that magic value. Altough this is not as rigorous as the current method, I believe it is sufficient in practice. Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-07 14:53:53 +11:00
Michael Ellerman	136cd3450a	powerpc/module: Only try to generate the ftrace_caller() stub once Currently we generate the module stub for ftrace_caller() at the bottom of apply_relocate_add(). However apply_relocate_add() is potentially called more than once per module, which means we will try to generate the ftrace_caller() stub multiple times. Although the current code deals with that correctly, ie. it only generates a stub the first time, it would be clearer to only try to generate the stub once. Note also on first reading it may appear that we generate a different stub for each section that requires relocation, but that is not the case. The code in stub_for_addr() that searches for an existing stub uses sechdrs[me->arch.stubs_section], ie. the single stub section for this module. A cleaner approach is to only generate the ftrace_caller() stub once, from module_finalize(). Although the original code didn't check to see if the stub was actually generated correctly, it seems prudent to add a check, so do that. And an additional benefit is we can clean the ifdefs up a little. Finally we must propagate the const'ness of some of the pointers passed to module_finalize(), but that is also an improvement. Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-07 14:53:53 +11:00
Michael Ellerman	a5cab83cd3	powerpc: Create a helper for getting the kernel toc value Move the logic to work out the kernel toc pointer into a header. This is a good cleanup, and also means we can use it elsewhere in future. Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Reviewed-by: Torsten Duwe <duwe@suse.de> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>	2016-03-07 14:53:52 +11:00
Linus Torvalds	b8155fe1b2	powerpc fixes for 4.5 #4 - cxl: Fix PSL timebase synchronization detection from Frederic Barrat - Fix oops when destroying hw_breakpoint event from Ravi Bangoria - Avoid lbarx on e5500 from Scott Wood -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIbBAABAgAGBQJW21wIAAoJEFHr6jzI4aWAiTQP+PWe24xwWmC/9FWBhT5rmHbs CO82Wy2X3gjMsQoXGqe48ohfToJAWEt5yXS+H/PKpoTYJ/yFcih2ufRw55Y6xzHs M98c4Rt4mAbGwkmwWTBjSc1zP+ReWeMNwRE69j6N6Xl+fDqGg6tMKrcysh6wdDLB ZhLeHZbVEGhIlXuxMjGyPrMxSrvcEZMMVrxp8eTycC6+wBHB4w4WzOpFx87WM0fn 3g1Q5WWzRjW2iCeWa6tJLwIIzEACrFs8nfJU08mJU1n33YYVbpu+2klG+5ppfGAL JiuKt9/ZoWSAMwNU7yEwFdgjtnOzywTnt6kAUWFdMKLV7DktJ+sf0ucZF9oBg87e ermsXt+Jy591NXgoif5gFMELWouC+Ti9pnqxdOTawYC8TKFm0OGExJFZQMGmzBw1 9ZpO/2pl0bsuNZyClspuFyVRrNb1sxddisSyKlxW+0Vnz1P432fRZZSz2gHLvClG RjQ+nsKZYj6vH0wfA+eG9+/a5iYjbfBKRNSjrO1g8pbc3ELeG5C2kXDePKnkt8+J jpvTLHxaRp44wNPcizPrEthrdlKxE8uvTUosRuSMSJu74SK7t8+9kCvgDnHYne51 f/2Ko7CzAV2SfdKB+nWI2oJxfYpPlxqORZkYGkQZIOD5uQ8pvJ4yCNzy43XqFHFE DyJOYeTxVQmfMT6jkrg= =ZcDj -----END PGP SIGNATURE----- Merge tag 'powerpc-4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - cxl: Fix PSL timebase synchronization detection from Frederic Barrat - Fix oops when destroying hw_breakpoint event from Ravi Bangoria - Avoid lbarx on e5500 from Scott Wood * tag 'powerpc-4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/fsl-book3e: Avoid lbarx on e5500 powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event cxl: Fix PSL timebase synchronization detection	2016-03-06 11:08:06 -08:00
chenhui zhao	6becef7ea0	powerpc/mpc85xx: Add CPU hotplug support for E6500 Support Freescale E6500 core-based platforms, like t4240. Support disabling/enabling individual CPU thread dynamically. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com>	2016-03-04 23:58:38 -06:00
chenhui zhao	2f4f1f815b	powerpc/mpc85xx: Add hotplug support on E5500 and E500MC cores Freescale E500MC and E5500 core-based platforms, like P4080, T1040, support disabling/enabling CPU dynamically. This patch adds this feature on those platforms. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com> [scottwood: removed unused pr_fmt] Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-04 23:56:31 -06:00
chenhui zhao	d17799f9c1	powerpc/rcpm: add RCPM driver There is a RCPM (Run Control/Power Management) in Freescale QorIQ series processors. The device performs tasks associated with device run control and power management. The driver implements some features: mask/unmask irq, enter/exit low power states, freeze time base, etc. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com> [scottwood: remove __KERNEL__ ifdef] Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-04 23:50:27 -06:00
chenhui zhao	e7affb1dba	powerpc/cache: add cache flush operation for various e500 Various e500 core have different cache architecture, so they need different cache flush operations. Therefore, add a callback function cpu_flush_caches to the struct cpu_spec. The cache flush operation for the specific kind of e500 is selected at init time. The callback function will flush all caches inside the current cpu. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Tang Yuantian <Yuantian.Tang@feescale.com> Signed-off-by: Scott Wood <oss@buserror.net>	2016-03-04 23:44:51 -06:00
Ravi Bangoria	fb822e6076	powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event When destroying a hw_breakpoint event, the kernel oopses as follows: Unable to handle kernel paging request for data at address 0x00000c07 NIP [c0000000000291d0] arch_unregister_hw_breakpoint+0x40/0x60 LR [c00000000020b6b4] release_bp_slot+0x44/0x80 Call chain: hw_breakpoint_event_init() bp->destroy = bp_perf_event_destroy; do_exit() perf_event_exit_task() perf_event_exit_task_context() WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE); perf_event_exit_event() free_event() _free_event() bp_perf_event_destroy() // event->destroy(event); release_bp_slot() arch_unregister_hw_breakpoint() perf_event_exit_task_context() sets child_ctx->task as TASK_TOMBSTONE which is (void )-1. arch_unregister_hw_breakpoint() tries to fetch 'thread' attribute of 'task' resulting in oops. Peterz points out that the code shouldn't be using bp->ctx anyway, but fixing that will require a decent amount of rework. So for now to fix the oops, check if bp->ctx->task has been set to (void )-1, before dereferencing it. We don't use TASK_TOMBSTONE, because that would require exporting it and it's supposed to be an internal detail. Fixes: `63b6da39bb` ("perf: Fix perf_event_exit_task() race") Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-03 22:06:08 +11:00
Aneesh Kumar K.V	f64e8084c9	powerpc/mm: Move hash related mmu-*.h headers to book3s/ No code changes. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-03 21:19:21 +11:00
Cyril Bur	bf6a4d5b75	powerpc: Add the ability to save VSX without giving it up This patch adds the ability to be able to save the VSX registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch builds on a previous optimisation for the FPU and VEC registers in the thread copy path to avoid a possibly pointless reload of VSX state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-02 23:34:50 +11:00
Cyril Bur	6f515d842e	powerpc: Add the ability to save Altivec without giving it up This patch adds the ability to be able to save the VEC registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch builds on a previous optimisation for the FPU registers in the thread copy path to avoid a possibly pointless reload of VEC state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-02 23:34:49 +11:00
Cyril Bur	8792468da5	powerpc: Add the ability to save FPU without giving it up This patch adds the ability to be able to save the FPU registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch optimises the thread copy path (as a result of a fork() or clone()) so that the parent thread can return to userspace with hot registers avoiding a possibly pointless reload of FPU register state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-02 23:34:49 +11:00
Cyril Bur	de2a20aa72	powerpc: Prepare for splitting giveup_{fpu, altivec, vsx} in two This prepares for the decoupling of saving {fpu,altivec,vsx} registers and marking {fpu,altivec,vsx} as being unused by a thread. Currently giveup_{fpu,altivec,vsx}() does both however optimisations to task switching can be made if these two operations are decoupled. save_all() will permit the saving of registers to thread structs and leave threads MSR with bits enabled. This patch introduces no functional change. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-02 23:34:48 +11:00
Cyril Bur	70fe3d980f	powerpc: Restore FPU/VEC/VSX if previously used Currently the FPU, VEC and VSX facilities are lazily loaded. This is not a problem unless a process is using these facilities. Modern versions of GCC are very good at automatically vectorising code, new and modernised workloads make use of floating point and vector facilities, even the kernel makes use of vectorised memcpy. All this combined greatly increases the cost of a syscall since the kernel uses the facilities sometimes even in syscall fast-path making it increasingly common for a thread to take an *_unavailable exception soon after a syscall, not to mention potentially taking all three. The obvious overcompensation to this problem is to simply always load all the facilities on every exit to userspace. Loading up all FPU, VEC and VSX registers every time can be expensive and if a workload does avoid using them, it should not be forced to incur this penalty. An 8bit counter is used to detect if the registers have been used in the past and the registers are always loaded until the value wraps to back to zero. Several versions of the assembly in entry_64.S were tested: 1. Always calling C. 2. Performing a common case check and then calling C. 3. A complex check in asm. After some benchmarking it was determined that avoiding C in the common case is a performance benefit (option 2). The full check in asm (option 3) greatly complicated that codepath for a negligible performance gain and the trade-off was deemed not worth it. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Move load_vec in the struct to fill an existing hole, reword change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> fixup	2016-03-02 23:34:48 +11:00
Cyril Bur	d272f6670a	powerpc: Explicitly disable math features when copying thread Currently when threads get scheduled off they always giveup the FPU, Altivec (VMX) and Vector (VSX) units if they were using them. When they are scheduled back on a fault is then taken to enable each facility and load registers. As a result explicitly disabling FPU/VMX/VSX has not been necessary. Future changes and optimisations remove this mandatory giveup and fault which could cause calls such as clone() and fork() to copy threads and run them later with FPU/VMX/VSX enabled but no registers loaded. This patch starts the process of having MSR_{FP,VEC,VSX} mean that a threads registers are hot while not having MSR_{FP,VEC,VSX} means that the registers must be loaded. This allows for a smarter return to userspace. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-02 23:34:47 +11:00
Thomas Gleixner	fc6d73d674	arch/hotplug: Call into idle with a proper state Let the non boot cpus call into idle with the corresponding hotplug state, so the hotplug core can handle the further bringup. That's a first step to convert the boot side of the hotplugged cpus to do all the synchronization with the other side through the state machine. For now it'll only start the hotplug thread and kick the full bringup of the cpu. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: Rik van Riel <riel@redhat.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-03-01 20:36:57 +01:00

1 2 3 4 5 ...

5030 Commits