linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 17:35:17 +07:00

Author	SHA1	Message	Date
Michael Neuling	57a9039052	powerpc/powernv: Only delay opal_rtc_read() retry when necessary Only delay opal_rtc_read() when busy and are going to retry. This has the advantage of possibly saving a massive 10ms off booting! Kudos to Stewart for noticing. Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-27 19:12:40 +11:00
Russell Currey	affddff69c	powerpc/powernv: Add a kmsg_dumper that flushes console output on panic On BMC machines, console output is controlled by the OPAL firmware and is only flushed when its pollers are called. When the kernel is in a panic state, it no longer calls these pollers and thus console output does not completely flush, causing some output from the panic to be lost. Output is only actually lost when the kernel is configured to not power off or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL flushes the console buffer as part of its power down routines. Before this patch, however, only partial output would be printed during the timeout wait. This patch adds a new kmsg_dumper which gets called at panic time to ensure panic output is not lost. It accomplishes this by calling OPAL_CONSOLE_FLUSH in the OPAL API, and if that is not available, the pollers are called enough times to (hopefully) completely flush the buffer. The flushing mechanism will only affect output printed at and before the kmsg_dump call in kernel/panic.c:panic(). As such, the "end Kernel panic" message may still be truncated as follows: >Call Trace: >[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable) >[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4 >[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c >[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254 >[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350 >[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130 >[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac >---[ end Kernel panic - not This functionality is implemented as a kmsg_dumper as it seems to be the most sensible way to introduce platform-specific functionality to the panic function. Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-27 19:12:40 +11:00
Michael Neuling	2fc251a8dd	powerpc: Copy only required pieces of the mm_context_t to the paca Currently we copy the whole mm_context_t to the paca but only access a few bits of it. This is wasteful of space paca and also takes quite some time in the hot path of context switching. This patch pulls in only the required bits from the mm_context_t to the paca and on context switch, copies only those. Benchmarking this (On top of Anton's recent MSR context switching changes [1]) using processes and yield shows an improvement of almost 3% on POWER8: http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --process 0 0 1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.html Signed-off-by: Michael Neuling <mikey@neuling.org> [mpe: Rename paca fields to be mm_ctx_foo rather than context_foo] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-27 19:12:14 +11:00
Hongtao Jia	3045e409e4	powerpc/mpc85xx: Add TMU device tree support for T1023/T1024 Also add nodes and properties for thermal management support. Meanwhile preprocessor support is needed using thermal of framework. Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-23 22:21:11 -06:00
Hongtao Jia	be489a3936	powerpc/mpc85xx: Add TMU device tree support for T1040/T1042 Also add nodes and properties for thermal management support. Meanwhile preprocessor support is needed using thermal of framework. Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-23 22:21:11 -06:00
Raghav Dogra	479f6a7fc6	powerpc/fsl_lbc: removal of dead code The condition check is not used. Signed-off-by: Raghav Dogra <raghav@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-23 15:02:00 -06:00
Zhao Qiang	b0417a2870	powerpc/p1010rdb: Update dts for pcie interrupt-map p1010rdb uses the irq[4:5] for inta and intb to pcie, it is active-high, so set it. Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 18:23:22 -06:00
Scott Wood	2749539b4b	powerpc/e6500: add locking to hugetlb e6500 has threads but does not have TLB write conditional. Thus, the hugetlb code needs to take the same lock that the normal TLB miss handlers take, to ensure that the tlbsx and tlbwe are atomic. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 18:23:22 -06:00
li pengbo	9d68e7accf	powerpc/85xx: Enable TWR_P102x in mpc85xx_basic_defconfig Enable TWR_P102x option by default in mpc85xx_basic_defconfig to support p1025twr board. Signed-off-by: Pengbo Li <Pengbo.Li@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 18:23:21 -06:00
Daniel Walker	433c858a61	powerpc/85xx: mpc85xx ADS: remove pci exclude This code was reworked in commit, `905e75c46d` This change removed the fsl_add_bridge() which originally was above the addition of the pci_exclude_device function. I think the assumption was that the pci_exclude_device would prevent changes to the bridge PCI config after it's been added. It seems it wasn't fully tested on MPC85xx ADS because if you move the fsl_add_bridge() the pci_exclude_device is set in the machine description then you can never update the PCI Config since the exclude prevents it. This disrupts things like DMA. This issue was extensively debugged by David Beazley. Cc: xe-kernel@external.cisco.com Cc: dbeazley@cisco.com Cc: dwalker@fifo99.com Signed-off-by: Daniel Walker <danielwa@cisco.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 18:23:21 -06:00
Igal Liberman	4e9de5e970	powerpc/mpc85xx: Update B4 FMan MURAM size FMan V3H has 2 different MURAM sizes: In B4860/4420 the MURAM size is 512KB. In T4240 and T2080 the MURAM size is 384KB. The MURAM size in FMan V3H device tree is 384KB. This patch updates the MURAM size for B4 to 512KB. Signed-off-by: Igal Liberman <igal.liberman@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 18:23:11 -06:00
Harninder Rai	720d7aebcd	powerpc/85xx: Add PCIe controller support for bsc9132qds 1. Use machine_arch_initcall to hook mpc85xx_common_publish_devices This can ensure before pcibios_init() is called, pci controllers have been probed and added to the hose_list. 2. Add a workaround for errata A-005434 For the BSC9132, PEX_PEXIWARn[TRGT] for all windows defaults to 0xF, which is mapped to CCSRBAR. However, for other products, 0xF is mapped to the local memory. Therefore, for the BSC9132, any default PCI Express access to the local memory (DDR) will now access the CCSRBAR. This patch changes the mapping of targets of inbound windows PEX_PEXIWARn[TRGT] to the Local address space – 0x0 (from 0xF). Signed-off-by: Harninder Rai <harninder.rai@freescale.com> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com> Signed-off-by: Hou Zhiqiang <B48286@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 18:17:15 -06:00
Harninder Rai	230dd6059a	powerpc/fsl: Add PCI node in device tree of bsc9132qds Signed-off-by: Harninder Rai <harninder.rai@freescale.com> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com> Signed-off-by: Hou Zhiqiang <B48286@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 18:17:14 -06:00
Linus Torvalds	e73a31778a	KVM/ARM fixes for v4.4-rc7 - A series of fixes to the MTRR emulation, tested in the BZ by several users so they should be safe this late - A fix for a division by zero - Two very simple ARM and PPC fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWeV/6AAoJEL/70l94x66DqzsH/05YnLi2GsX5WeZHMfIUgzgT S/GoIkA7A4E2eXVoGg824MWppSViUzZkWgYFQTG4+KY9WPXzm9z2ij7DIlUHCD6n QfevgQx1kIu1obyhm6bYM2xUdM3f7NCsQgw9bXZObB0ay+b/+GjR9/RbCbx60EO5 K1P+kveK6PFlS9/hc0PLztu6WkPV9BCO1RJUbeAEdnrMbpuQfHC+coR7MHRCiv2V iy8f1CqrGaO5YPm9/3GbdH1xMKew4OZShOxTXwtvUThdrLkks2c8sk6FoLzqkznH LMHVIpkm4mrIgThZG7VqZMXOrWvBtsCt04Vr9MzCM6QetB02b/Uz0xKvMYx2kZQ= =pmYz -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: - A series of fixes to the MTRR emulation, tested in the BZ by several users so they should be safe this late - A fix for a division by zero - Two very simple ARM and PPC fixes * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Reload pit counters for all channels when restoring state KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID KVM: MTRR: observe maxphyaddr from guest CPUID, not host KVM: MTRR: fix fixed MTRR segment look up KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX KVM: arm/arm64: vgic: Fix kvm_vgic_map_is_active's dist check kvm: x86: move tracepoints outside extended quiescent state KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR	2015-12-22 15:47:39 -08:00
Zhao Qiang	7aa1aa6ece	QE: Move QE from arch/powerpc to drivers/soc ls1 has qe and ls1 has arm cpu. move qe from arch/powerpc to drivers/soc/fsl to adapt to powerpc and arm Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 17:12:56 -06:00
Zhao Qiang	302c059f2e	QE: use subsys_initcall to init qe Use subsys_initcall to init qe to adapt ARM architecture. Remove qe_reset from PowerPC platform file. Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 17:10:18 -06:00
Zhao Qiang	1291e49e89	QE/CPM: move muram management functions to qe_common QE and CPM have the same muram, they use the same management functions. Now QE support both ARM and PowerPC, it is necessary to move QE to "driver/soc", so move the muram management functions from cpm_common to qe_common for preparing to move QE code to "driver/soc" Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 17:10:18 -06:00
Zhao Qiang	0e6e01ff69	CPM/QE: use genalloc to manage CPM/QE muram Use genalloc to manage CPM/QE muram instead of rheap. Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-12-22 17:10:18 -06:00
Michael Neuling	c395465da6	powerpc: Add function to copy mm_context_t to the paca This adds a function to copy the mm->context to the paca. This is only a basic conversion for now but will be used more extensively in the next patch. This also adds #ifdef CONFIG_PPC_BOOK3S around this code since it's not used elsewhere. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-19 22:13:12 +11:00
Alistair Popple	036592fbbe	powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion" Commit `25642e1459` ("powerpc/opal-irqchip: Fix double endian conversion") fixed an endian bug by calling opal_handle_events() in opal_event_unmask(). However this introduced a deadlock if we find an event is active during unmasking and call opal_handle_events() again. The bad call sequence is: opal_interrupt() -> opal_handle_events() -> generic_handle_irq() -> handle_level_irq() -> raw_spin_lock(&desc->lock) handle_irq_event(desc) unmask_irq(desc) -> opal_event_unmask() -> opal_handle_events() -> generic_handle_irq() -> handle_level_irq() -> raw_spin_lock(&desc->lock) (BOOM) When generating multiple opal events in quick succession this would lead to the following stall warnings: EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32 INFO: rcu_sched detected stalls on CPUs/tasks: 12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065 15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065 (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602) NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696] INFO: rcu_sched detected stalls on CPUs/tasks: 12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371 15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371 (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290) This patch corrects the problem by queuing the work if an event is active during unmasking, which is similar to the pre-endian fix behaviour. Fixes: `25642e1459` ("powerpc/opal-irqchip: Fix double endian conversion") Signed-off-by: Alistair Popple <alistair@popple.id.au> Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-18 22:24:15 +11:00
Daniel Axtens	1b855e167b	powerpc: Add missing calls to va_end() cppcheck picked up that there were a couple of missing va_end() calls in functions using va_start(). Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 23:23:22 +11:00
Nathan Fontenot	e9d764f803	powerpc/pseries: Enable kernel CPU dlpar from sysfs Enable new kernel cpu hotplug functionality by allowing cpu dlpar requests to be initiated from sysfs. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:03 +11:00
Nathan Fontenot	90edf184b9	powerpc/pseries: Add CPU dlpar add functionality Add the ability to hotplug add cpus via rtas hotplug events by either specifying the drc index of the CPU to add, or providing a count of the number of CPUs to add. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:02 +11:00
Nathan Fontenot	ac71380071	powerpc/pseries: Add CPU dlpar remove functionality Add the ability to dlpar remove CPUs via hotplug rtas events, either by specifying the drc-index of the CPU to remove or providing a count of cpus to remove. To remove multiple cpus in a single request we create a list of possible DR (Dynamic Reconfiguration) cpus and their drc indexes that can be removed. We can then traverse the list remove each cpu and easily clean up in any cases of failure. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:02 +11:00
Nathan Fontenot	e666ae0b10	powerpc/pseries: Update CPU hotplug error recovery Update the cpu dlpar add/remove paths to do better error recovery when a failure occurs during the add/remove operation. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:02 +11:00
Nathan Fontenot	d98389f375	powerpc/pseries: Factor out common cpu hotplug code Re-factor the cpu hotplug code to support doing cpu hotplug completely in the kernel and using the existing sysfs probe/release interfaces. This patch pulls out pieces of existing cpu hotplug code into common routines, dlpar_cpu_add() and dlpar_cpu_remove(), to be used by both interfaces. There are no functional changes introduced. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:01 +11:00
Nathan Fontenot	183deeea58	powerpc/pseries: Consolidate CPU hotplug code to hotplug-cpu.c No functional changes, this patch is simply a move of the cpu hotplug code from pseries/dlpar.c to pseries/hotplug-cpu.c. This is in an effort to consolidate all of the cpu hotplug code in a common place. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:01 +11:00
Nathan Fontenot	1f859adb92	powerpc/pseries: Verify CPU doesn't exist before adding When DLPAR adding a CPU we should verify that the CPU does not already exist. Failure to do so can generate a kernel oops; [ 9.465585] kernel BUG at arch/powerpc/platforms/pseries/dlpar.c:382! [ 9.465796] Oops: Exception in kernel mode, sig: 5 [#1] This oops can be generated by causing a probe to be performed on a cpu by writing to the sysfs cpu probe file (/sys/devices/system/cpu/probe). This patch adds a check for the existence of cpu prior to probing the cpu so userspace doing the wrong thing won't trigger a BUG_ON(). Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:01 +11:00
Alistair Popple	4450022b49	powerpc/476fpe: Add support for kexec PPC476FPE has a different PVR from previous PPC476 processors. The kexec code checks the PVR in order to correctly setup the MMU. When the initial support for 476FPE processors was added the corresponding change in the kexec code was missed. This patch simply adds the check and solves the following bug on kexec: kexec: Starting new kernel Bye! Unable to handle kernel paging request for instruction fetch Faulting instruction address: 0xee9a50f8 cpu 0x0: Vector: 400 (Instruction Access) at [ee9d7d20] pc: ee9a50f8 lr: ee9a50e4 sp: ee9d7dd0 msr: 21020 current = 0xee40f000 pid = 960, comm = kexec enter ? for help [link register ] ee9a50e4 [ee9d7dd0] c0013748 default_machine_kexec+0x58/0x70 (unreliable) [ee9d7df0] c0012f04 machine_kexec+0x34/0x40 [ee9d7e00] c00aa1ec kernel_kexec+0x9c/0xb0 [ee9d7e20] c005d704 SyS_reboot+0x1f4/0x220 [ee9d7f40] c000db68 ret_from_syscall+0x0/0x3c Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:00 +11:00
Alistair Popple	5d2aa710e6	powerpc/powernv: Add support for Nvlink NPUs NVLink is a high speed interconnect that is used in conjunction with a PCI-E connection to create an interface between CPU and GPU that provides very high data bandwidth. A PCI-E connection to a GPU is used as the control path to initiate and report status of large data transfers sent via the NVLink. On IBM Power systems the NVLink processing unit (NPU) is similar to the existing PHB3. This patch adds support for a new NPU PHB type. DMA operations on the NPU are not supported as this patch sets the TCE translation tables to be the same as the related GPU PCIe device for each NVLink. Therefore all DMA operations are setup and controlled via the PCIe device. EEH is not presently supported for the NPU devices, although it may be added in future. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:41:00 +11:00
Alistair Popple	a84bf32140	powerpc: Add __raw_rm_writeq() function Move __raw_rm_writeq() from platforms/powernv/pci-ioda.c to include/asm/io.h so that it can be used by other code. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:59 +11:00
Alistair Popple	94973b24d6	Revert "powerpc/pci: Remove unused struct pci_dn.pcidev field" This commit removed the pcidev field from struct pci_dn as it was no longer in use by the kernel. However to support finding the association of Nvlink devices to GPU devices from the device-tree this field is required. This reverts commit `250c7b277c` ("powerpc/pci: Remove unused struct pci_dn.pcidev field"). Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:59 +11:00
Gavin Shan	e80c4e7ca5	powerpc/powernv: Fix M64 resource name in /proc/iomem The name of PCI root bus's M64 resource isn't initialized properly. When dumping "/proc/iomem", "<BAD>" is seen for those M64 resources on PCI root buses. ~# cat /proc/iomem \| grep -e "BAD" 3b0000000000-3b0fefffffff : <BAD> 3b1000000000-3b1fefffffff : <BAD> 3c0000000000-3c0fefffffff : <BAD> 3c1000000000-3c1fefffffff : <BAD> 3c2000000000-3c2fefffffff : <BAD> This fixes the issue by setting the name of PCI root bus's M64 resource to that of PHB's device node full name. With the patch, no "<BAD>" is seen from "/proc/iomem". Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:59 +11:00
Laurent Dufour	7207f43665	powerpc/mm: Add page soft dirty tracking User space checkpoint and restart tool (CRIU) needs the page's change to be soft tracked. This allows to do a pre checkpoint and then dump only touched pages. This is done by using a newly assigned PTE bit (_PAGE_SOFT_DIRTY) when the page is backed in memory, and a new _PAGE_SWP_SOFT_DIRTY bit when the page is swapped out. To introduce a new PTE _PAGE_SOFT_DIRTY bit value common to hash 4k and hash 64k pte, the bits already defined in hash-*4k.h should be shifted left by one. The _PAGE_SWP_SOFT_DIRTY bit is dynamically put after the swap type in the swap pte. A check is added to ensure that the bit is not overwritten by _PAGE_HPTEFLAGS. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:58 +11:00
Michael Ellerman	2613265cb5	powerpc/kernel: Combine vec/loc for STD_EXCEPTION_PSERIES The STD_EXCEPTION_PSERIES macro takes both a vector number, and a location (memory address). However both are always identical, so combine them to save repeating ourselves. This does mean an exception handler must always exist at the location in memory that matches its vector number. But that's OK because this is the "STD" macro (standard), which does exactly that. We have other macros for the other cases, eg. STD_EXCEPTION_PSERIES_OOL (out of line). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:58 +11:00
Michael Ellerman	d8725ce86c	powerpc/kernel: Open code SET_DEFAULT_THREAD_PPR This is only used in one location, open code it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:57 +11:00
Michael Ellerman	d030a4b5eb	powerpc/kernel: Open code HMT_MEDIUM_LOW_HAS_PPR HMT_MEDIUM_LOW_HAS_PPR is only used in once place, open code it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:57 +11:00
Michael Ellerman	d6265aeaf8	powerpc/kernel: Drop HMT_MEDIUM_PPR_DISCARD HMT_MEDIUM_PPR_DISCARD is a macro which is present at the start of most of our first level exception handlers. It conditionally executes a HMT_MEDIUM instruction, which sets the processor priority to medium. On on modern systems, ie. Power7 and later, it is nop'ed out at boot. All it does is make the exception vectors more cramped, and consume 4 bytes of icache. On old systems it has the effect of boosting the processor priority at the start of exception processing. If we were previously in the idle loop for example, we may be at low or very low priority. This is desirable as we want to process the exception as fast as possible. However looking closely at the generated code, we see that in all cases we execute another HMT_MEDIUM just four instructions later. With code patching applied, the final code on an old (Power6) system will look like, eg: c000000000000300 <data_access_pSeries>: c000000000000300: 7c 42 13 78 mr r2,r2 <- c000000000000304: 7d b2 43 a6 mtsprg 2,r13 c000000000000308: 7d b1 42 a6 mfsprg r13,1 c00000000000030c: f9 2d 00 80 std r9,128(r13) c000000000000310: 60 00 00 00 nop c000000000000314: 7c 42 13 78 mr r2,r2 <- So I suggest that the added code complexity of HMT_MEDIUM_PPR_DISCARD is not justified by the benefit of boosting the processor priority for the duration of four instructions, and therefore we drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:57 +11:00
Michael Ellerman	cd5cdeb6c8	powerpc/rtas: Make enter_rtas() private There are no longer any users of enter_rtas() outside of rtas.c, so make it "private", by moving the declaration inside rtas.c. Hopefully this will encourage people to use one of the wrappers which takes the sharp edges off the RTAS calling sequence. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:56 +11:00
Michael Ellerman	4456f45246	powerpc/rtas: Use rtas_call_unlocked() in call_rtas_display_status() Although call_rtas_display_status() does actually want to use the regular RTAS locking, it doesn't want the extra logic that is in rtas_call(), so currently it open codes the logic. Instead we can use rtas_call_unlocked(), after taking the RTAS lock. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:56 +11:00
Michael Ellerman	b2e8590fa1	powerpc/pseries: Use rtas_call_unlocked() in pseries hotplug Avoid open coding the logic by using rtas_call_unlocked(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:55 +11:00
Michael Ellerman	08eb105a7c	powerpc/xmon: Use rtas_call_unlocked() in xmon Avoid open coding the logic by using rtas_call_unlocked(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:55 +11:00
Michael Ellerman	209eb4e5cb	powerpc/rtas: Add rtas_call_unlocked() Most users of RTAS (Run-Time Abstraction Services) use rtas_call(), which deals with locking as well as endian handling. However we have two users outside of rtas.c that can't use rtas_call() because they have different locking requirements. The hotplug CPU code can't take the RTAS lock because the CPU would go offline with the lock held and no other CPUs would be able to call RTAS until the CPU came back online. The xmon code doesn't want to take the lock because it would risk dead locking when we are trying to recover from a crash. Both sites required multiple patches when we added little endian support, proving that programmers can't do endian right. Although that ship has sailed, we can still clean the code up by providing an unlocked version of rtas_call() which avoids the need to open code the logic elsewhere. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:55 +11:00
Stewart Smith	e4d54f71d2	powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is just OPALv3, with nobody ever expecting anything on pre-OPALv3 to be cared about or supported by mainline kernels. So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL exclusively. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:54 +11:00
Stewart Smith	7261aafc09	powerpc/powernv: Remove OPALv2 firmware define and references OPALv2 only ever existed in the lab and didn't escape to the world. All OPAL systems in the wild are OPALv3. The probability of there being an OPALv2 system still powered on anywhere inside IBM is approximately zero, let alone anyone expecting to run mainline kernels. So, start to remove references to OPALv2. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:54 +11:00
Stewart Smith	786842b62f	powerpc/powernv: panic() on OPAL < V3 The OpenPower Abstraction Layer firmware went through a couple of iterations in the lab before being released. What we now know as OPAL advertises itself as OPALv3. OPALv2 and OPALv1 never made it outside the lab, and the possibility of anyone at all ever building a mainline kernel today and expecting it to boot on such hardware is zero. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 22:40:53 +11:00
Stewart Smith	98da62b716	powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type When running on newer OPAL firmware that supports sending extra OPAL_MSG types, we would print a warning on every message received. This could be a problem for kernels that don't support OPAL_MSG_OCC on machines that are running real close to thermal limits and the OCC is throttling the chip. For a kernel that is paying attention to the message queue, we could get these notifications quite often. Conceivably, future message types could also come fairly often, and printing that we didn't understand them 10,000 times provides no further information than printing them once. Cc: stable@vger.kernel.org Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-17 20:42:13 +11:00
Haren Myneni	6333ed8f26	crypto: nx-842 - Mask XERS0 bit in return value NX842 coprocessor sets 3rd bit in CR register with XER[S0] which is nothing to do with NX request. Since this bit can be set with other valuable return status, mast this bit. One of other bits (INITIATED, BUSY or REJECTED) will be returned for any given NX request. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-12-17 16:42:12 +08:00
Michael Ellerman	2475c36213	Partial revert of "powerpc: Individual System V IPC system calls" This partially reverts commit `a34236155a`. While reviewing the glibc patch to exploit the individual IPC calls, Arnd & Andreas noticed that we were still requiring userspace to pass IPC_64 in order to get the new style IPC API. With a bit of cleanup in the kernel we can drop that requirement, and instead only provide the new style API, which will simplify things for userspace. Rather than try and sneak that patch into 4.4, instead we will drop the individual IPC calls for powerpc, and merge them again in 4.5 once the cleanup patch has gone in. Because we've already added sys_mlock2() as syscall #378, we don't do a full revert of the IPC calls. Instead we drop the __NR #defines, and send those now undefined syscall numbers to sys_ni_syscall(). This leaves a gap in the syscall numbers, but we'll reuse them when we merge the individual IPC calls. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Arnd Bergmann <arnd@arndb.de>	2015-12-16 21:52:32 +11:00
Daniel Axtens	00b912b0c8	powerpc: Remove broken GregorianDay() GregorianDay() is supposed to calculate the day of the week (tm->tm_wday) for a given day/month/year. In that calcuation it indexed into an array called MonthOffset using tm->tm_mon-1. However tm_mon is zero-based, not one-based, so this is off-by-one. It also means that every January, GregoiranDay() will access element -1 of the MonthOffset array. It also doesn't appear to be a correct algorithm either: see in contrast kernel/time/timeconv.c's time_to_tm function. It's been broken forever, which suggests no-one in userland uses this. It looks like no-one in the kernel uses tm->tm_wday either (see e.g. drivers/rtc/rtc-ds1305.c:319). tm->tm_wday is conventionally set to -1 when not available in hardware so we can simply set it to -1 and drop the function. (There are over a dozen other drivers in drivers/rtc that do this.) Found using UBSAN. Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> # as an example of what UBSan finds. Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: rtc-linux@googlegroups.com Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-16 12:54:04 +11:00
Rashmica Gupta	eb925d6460	powerpc/xmon: Append linux_banner to exception information in xmon. Currently if you are in xmon without an oops etc. to view the kernel version you have to type "d $linux_banner" - not necessarily obvious. As this is useful information, append to the output of "e" command. Example output: $mon> e cpu 0x1: Vector: 0 at [c0000000f879ba80] pc: c000000000081718: sysrq_handle_xmon+0x68/0x80 lr: c000000000081718: sysrq_handle_xmon+0x68/0x80 sp: c0000000f879bbe0 msr: 8000000000009033 current = 0xc0000000f604d5c0 paca = 0xc00000000fdc0480 softe: 0 irq_happened: 0x01 pid = 2467, comm = bash Linux version 4.4.0-rc2-00008-gc51af91c3ab3-dirty (rashmica@circle) (gcc version 5.1.1 20150629 (GCC) ) #45 SMP Wed Nov 25 10:25:12 AEDT 2015 Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 20:41:50 +11:00
Rashmica Gupta	24ad1648ed	powerpc/cell: Remove the Cell QPACE code All users of QPACE have upgraded to QPACE2 so remove the Cell QPACE code. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 20:41:50 +11:00
Vipin K Parashar	b4af279a7c	powerpc/pseries: Limit EPOW reset event warnings Kernel prints respective warnings about various EPOW events for user information/action after parsing EPOW interrupts. At times below EPOW reset event warning is seen to be flooding kernel log over a period of time. May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared These EPOW reset events are spurious in nature and are triggered by firmware without an actual EPOW event being reset. This patch avoids these multiple EPOW reset warnings by using a counter variable. This variable is incremented every time an EPOW event is reported. Upon receiving a EPOW reset event the same variable is checked to filter out spurious events and decremented accordingly. This patch also improves log messages to better describe EPOW event being reported. Merged adjacent log messages into single one to reduce number of lines printed per event. Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 20:41:49 +11:00
Michael Ellerman	1901d8bb45	powerpc fixes for 4.4 #2 - tm: Block signal return from setting invalid MSR state from Michael Neuling - tm: Check for already reclaimed tasks from Michael Neuling -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWVul+AAoJEFHr6jzI4aWAUZAQAK2s+E4WTtcExXSG1bqq05o6 5wCLtIq6M91h5HDpgBfN7S5OXpRd72ZeIVdzC0HkFeLBF3y7NSHSEezUw4g/GGfz K2xGV1CCXC3Rb3qyHSdyi6+c1AnLVRPBVzVxPVmlXigrXeFiQ4613YW9rzf9b8fs oktUciwW9aHbrIv7g8f82gpuk9jwwhp/sF+1H/7fGOozT4CFsKo4wj4HOOCBwH4y ODEjs6Z+9Uwb6Kfvi/rn3k4XA1wC36WFq3ORI6KrmK/ZB1eR0Kwf0IELYpMj8cOX q5ZtCH7t68f9vmEK2B34AUijf/amm+2vLwvF6xAuZJFPUPZtgMBdRcqkLalbtPAO 8hlyPPgoZcgR/Of+lEYxUobcL0SMNufXwmfwRO35ktkm9Z9Ee96C8NNbpybBSDXL YRa6is5MeO4GL8Gbcc0TA50hGjok7o3acGE6HSAReyzf0guQ4xqcif7+6lfWZPkk P3aM02ajp2qoqyjhT/Ei6JlMptAiuQY+HvELFneqn5s9nDbv6cGuYZNNap0c1fK+ 74W0p7MiZh7+IF5HpyUIeYV836inXMDIoKzjA6H3OWitk/1lbcrbF34Qpz9zAWZn YF3w786ZzzQLw0jcALaqZejm58MLGIakO4MNDB0/ZBh0nKKfEV8WvPOjev78OAp1 +pDrJh0iQzrujN6OMKQ0 =IL8A -----END PGP SIGNATURE----- Merge tag 'powerpc-4.4-3' into next Merge the two TM fixes we merged in 4.4. We are about to merge selftests for these, and without the fixes the selftests will oops. powerpc fixes for 4.4 #2 - tm: Block signal return from setting invalid MSR state from Michael Neuling - tm: Check for already reclaimed tasks from Michael Neuling	2015-12-14 20:40:32 +11:00
Michael Neuling	801c0b2c4d	powerpc: Print MSR TM bits in oops messages Print MSR TM bits in oops messages. This appends them to the end like this: MSR: 8000000502823031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[TE]> You get the TM[] only if at least one TM MSR bit is set. Inside the TM[], E means Enabled (bit 32), S means Suspended (bit 33), and T means Transactional (bit 34) If no bits are set, you get no TM[] output. Include rework of printbits() to handle this case. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 20:40:26 +11:00
Boqun Feng	81d7a3294d	powerpc: Make {cmp}xchg* and their atomic_ versions fully ordered According to memory-barriers.txt, xchg, cmpxchg and their atomic_ versions all need to be fully ordered, however they are now just RELEASE+ACQUIRE, which are not fully ordered. So also replace PPC_RELEASE_BARRIER and PPC_ACQUIRE_BARRIER with PPC_ATOMIC_ENTRY_BARRIER and PPC_ATOMIC_EXIT_BARRIER in __{cmp,}xchg_{u32,u64} respectively to guarantee fully ordered semantics of atomic{,64}_{cmp,}xchg() and {cmp,}xchg(), as a complement of commit `b97021f855` ("powerpc: Fix atomic_xxx_return barrier semantics") This patch depends on patch "powerpc: Make value-returning atomics fully ordered" for PPC_ATOMIC_ENTRY_BARRIER definition. Cc: stable@vger.kernel.org # 3.2+ Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 20:39:01 +11:00
Boqun Feng	49e9cf3f0c	powerpc: Make value-returning atomics fully ordered According to memory-barriers.txt: > Any atomic operation that modifies some state in memory and returns > information about the state (old or new) implies an SMP-conditional > general memory barrier (smp_mb()) on each side of the actual > operation ... Which mean these operations should be fully ordered. However on PPC, PPC_ATOMIC_ENTRY_BARRIER is the barrier before the actual operation, which is currently "lwsync" if SMP=y. The leading "lwsync" can not guarantee fully ordered atomics, according to Paul Mckenney: https://lkml.org/lkml/2015/10/14/970 To fix this, we define PPC_ATOMIC_ENTRY_BARRIER as "sync" to guarantee the fully-ordered semantics. This also makes futex atomics fully ordered, which can avoid possible memory ordering problems if userspace code relies on futex system call for fully ordered semantics. Fixes: `b97021f855` ("powerpc: Fix atomic_xxx_return barrier semantics") Cc: stable@vger.kernel.org # 3.2+ Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 20:38:18 +11:00
Aneesh Kumar K.V	4dcbd88eb6	powerpc/mm: Don't open code pgtable_t size The slot information of base page size hash pte is stored in the pgtable_t w.r.t transparent hugepage. We need to make sure we don't index beyond pgtable_t size. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:17 +11:00
Aneesh Kumar K.V	4ad90c8649	powerpc/mm: Use H_READ with H_READ_4 This will bulk read 4 hash pte slot entries and should reduce the loop Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:17 +11:00
Aneesh Kumar K.V	45949ebe6c	powerpc/nohash: we don't use real_pte_t for nohash Remove the related functions and #defines Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:16 +11:00
Aneesh Kumar K.V	cc50380db3	powerpc/nohash: Update 64K nohash config to have 32 pte fragement They don't need to track 4k subpage slot details and hence don't need second half of pgtable_t. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:16 +11:00
Aneesh Kumar K.V	4d9057c39a	powerpc/mm: Don't hardcode the hash pte slot shift Use the #define instead of open-coding the same Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:15 +11:00
Aneesh Kumar K.V	62607bc64c	powerpc/mm: Don't hardcode page table size pte and pmd table size are dependent on config items. Don't hard code the same. This make sure we use the right value when masking pmd entries and also while checking pmd_bad Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:15 +11:00
Aneesh Kumar K.V	6a119eae94	powerpc/mm: Add a _PAGE_PTE bit For a pte entry we will have _PAGE_PTE set. Our pte page address have a minimum alignment requirement of HUGEPD_SHIFT_MASK + 1. We use the lower 7 bits to indicate hugepd. ie. For pmd and pgd we can find: 1) _PAGE_PTE set pte -> indicate PTE 2) bits [2..6] non zero -> indicate hugepd. They also encode the size. We skip bit 1 (_PAGE_PRESENT). 3) othewise pointer to next table. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:14 +11:00
Aneesh Kumar K.V	e34aa03ca4	powerpc/mm: Move THP headers around We support THP only with book3s_64 and 64K page size. Move THP details to hash64-64k.h to clarify the same. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:14 +11:00
Aneesh Kumar K.V	26a344aea4	powerpc/mm: Move hugetlb related headers W.r.t hugetlb, we support two format for pmd. With book3s_64 and 64K linux page size, we can have pte at the pmd level. Hence we don't need to support hugepd there. For everything else hugepd is supported and pmd_huge is (0). Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:13 +11:00
Aneesh Kumar K.V	40e8550afc	powerpc/mm: Move WIMG update to helper. Only difference here is, we apply the WIMG mapping early, so rflags passed to updatepp will also be changed. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:13 +11:00
Aneesh Kumar K.V	c6a3c495f0	powerpc/mm: Add helper for converting pte bit to hpte bits Instead of open coding it in multiple code paths, export the helper and add more documentation. Also make sure we don't make assumption regarding pte bit position Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:12 +11:00
Aneesh Kumar K.V	a43c0eb836	powerpc/mm: Convert 4k insert from asm to C This is similar to 64K insert. May be we want to consolidate Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:12 +11:00
Aneesh Kumar K.V	89ff725051	powerpc/mm: Convert __hash_page_64K to C Convert from asm to C Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:11 +11:00
Aneesh Kumar K.V	227fdbee5a	powerpc/mm: Increase the width of #define No real change, only style changes Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:11 +11:00
Aneesh Kumar K.V	506b863c68	powerpc/mm: Remove pte_val usage for the second half of pgtable_t Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:10 +11:00
Aneesh Kumar K.V	bf680d5160	powerpc/mm: Don't track subpage valid bit in pte_t This free up 11 bits in pte_t. In the later patch we also change the pte_t format so that we can start supporting migration pte at pmd level. We now track 4k subpage valid bit as below If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT and _PAGE_F_SECOND. Together we have 4 bits, each of them used to indicate whether any of the 4 4k subpage in that group is valid. ie, [ group 1 bit ] [ group 2 bit ] ..... [ group 4 ] [ subpage 1 - 4] [ subpage 5- 8] ..... [ subpage 13 - 16] We still track each 4k subpage slot number and secondary hash information in the second half of pgtable_t. Removing the subpage tracking have some significant overhead on aim9 and ebizzy benchmark and to support THP with 4K subpage, we do need a pgtable_t of 4096 bytes. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:10 +11:00
Aneesh Kumar K.V	106713a145	powerpc/mm: Remove the dependency on pte bit position in asm code We should not expect pte bit position in asm code. Simply by moving part of that to C Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:10 +11:00
Aneesh Kumar K.V	91f1da9979	powerpc/mm: Convert 4k hash insert to C Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:09 +11:00
Aneesh Kumar K.V	17ed9e3192	powerpc/booke: Move nohash headers Move the booke related headers below booke/32 or booke/64 Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:09 +11:00
Aneesh Kumar K.V	1ca7212932	powerpc/mm: Move PTE bits from generic functions to hash64 functions. functions which operate on pte bits are moved to hash*.h and other generic functions are moved to pgtable.h Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:08 +11:00
Aneesh Kumar K.V	371352ca0e	powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h This enables us to keep hash64 related bits together, and makes it easy to follow. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:08 +11:00
Aneesh Kumar K.V	f281b5d50c	powerpc/mm: Don't use pmd_val, pud_val and pgd_val as lvalue We convert them static inline function here as we did with pte_val in the previous patch Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:07 +11:00
Aneesh Kumar K.V	10bd3808df	powerpc/mm: Don't use pte_val as lvalue We also convert few #define to static inline in this patch for better type checking Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:07 +11:00
Aneesh Kumar K.V	b0412ea94b	powerpc/mm: Drop pte-common.h from BOOK3S 64 We copy only needed PTE bits define from pte-common.h to respective hash related header. This should greatly simply later patches in which we are going to change the pte format for hash config Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:06 +11:00
Aneesh Kumar K.V	ee4889c7bc	powerpc/mm: Don't have generic headers introduce functions touching pte bits We are going to drop pte_common.h in the later patch. The idea is to enable hash code not require to define all PTE bits. Having PTE bits defined in pte_common.h made the code unnecessarily complex. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:06 +11:00
Aneesh Kumar K.V	cbbb8683fb	powerpc/mm: Delete booke bits from book3s We also move __ASSEMBLY__ towards the end of header. This avoid having #ifndef __ASSEMBLY___ all over the header Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:05 +11:00
Aneesh Kumar K.V	ab537dca2f	powerpc/mm: Move hash specific pte width and other defines to book3s This further make a copy of pte defines to book3s/64/hash*.h. This remove the dependency on pgtable-ppc64-4k.h and pgtable-ppc64-64k.h Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:05 +11:00
Aneesh Kumar K.V	3dfcb315d8	powerpc/mm: make a separate copy for book3s In this patch we do: cp pgtable-ppc32.h book3s/32/pgtable.h cp pgtable-ppc64.h book3s/64/pgtable.h This enable us to do further changes to hash specific config. We will change the page table format for 64bit hash in later patches. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:05 +11:00
Aneesh Kumar K.V	26b6a3d9bb	powerpc/mm: move pte headers to book3s directory Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:04 +11:00
Aneesh Kumar K.V	0863d7f213	powerpc/mm: Fix infinite loop in hash fault with 4K page size This is the same bug we fixed as part of `09567e7fd4` ("powerpc/mm: Check paca psize is up to date for huge mappings"). Please check that for details. The difference here is that faults were happening on a 4K page at an address previously mapped by hugetlb. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-14 15:19:04 +11:00
Scott Wood	666db563d3	EDAC, mpc85xx: Make mpc85xx-pci-edac a platform device Originally the mpc85xx-pci-edac driver bound directly to the PCI controller node. Commit `905e75c46d` ("powerpc/fsl-pci: Unify pci/pcie initialization code") turned the PCI controller code into a platform device. Since we can't have two drivers binding to the same device, the EDAC code was changed to be called into as a library-style submodule. However, this doesn't work if the EDAC driver is built as a module. Commit 8d8fcba6d1ea ("EDAC: Rip out the edac_subsys reference counting") exposed another problem with this approach -- mpc85xx_pci_err_probe() was being called in the same early boot phase that the PCI controller is initialized, rather than in the device_initcall phase that the EDAC layer expects. This caused a crash on boot. To fix this, the PCI controller code now creates a child platform device specifically for EDAC, which the mpc85xx-pci-edac driver binds to. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Daniel Axtens <dja@axtens.net> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Jia Hongtao <B38951@freescale.com> Cc: Jiri Kosina <jkosina@suse.com> Cc: Kim Phillips <kim.phillips@freescale.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Masanari Iida <standby24x7@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: http://lkml.kernel.org/r/1449774432-18593-1-git-send-email-scottwood@freescale.com Signed-off-by: Borislav Petkov <bp@suse.de>	2015-12-11 16:56:16 +01:00
Bjorn Helgaas	800e07b609	Merge branches 'pci/aspm', 'pci/hotplug', 'pci/misc' and 'pci/msi' into next * pci/aspm: PCI/ASPM: Make sysfs link_state_store() consistent with link_state_show() * pci/hotplug: PCI: pciehp: Always protect pciehp_disable_slot() with hotplug mutex * pci/misc: x86/PCI: Simplify pci_bios_{read,write} PCI: Simplify config space size computation PCI: Limit config space size for Netronome NFP6000 family PCI: Add Netronome vendor and device IDs PCI: Support PCIe devices with short cfg_size x86/PCI: Clarify AMD Fam10h config access restrictions comment PCI: Print warnings for all invalid expansion ROM headers PCI: Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask * pci/msi: PCI/MSI: Remove empty pci_msi_init_pci_dev() PCI/MSI: Initialize MSI capability for all architectures	2015-12-10 19:40:14 -06:00
Bjorn Helgaas	93de690176	PCI: Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask Bit 7 of the "Header Type" register indicates a multi-function device when set. Bits 0-6 contain encoded values, where 0x1 indicates a PCI-PCI bridge. It is incorrect to test this as though it were a mask. For example, while the PCI 3.0 spec only defines values 0x0, 0x1, and 0x2, it's conceivable that a future spec could define 0x3 to mean something else; then tests for "(hdr_type & 0x7f) & PCI_HEADER_TYPE_BRIDGE" would incorrectly succeed for this new 0x3 header type. Test bits 0-6 of the Header Type for equality with PCI_HEADER_TYPE_BRIDGE. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2015-12-10 19:38:06 -06:00
Anton Blanchard	db1231dcdb	powerpc: Fix DSCR inheritance over fork() Two DSCR tests have a hack in them: /* * XXX: Force a context switch out so that DSCR * current value is copied into the thread struct * which is required for the child to inherit the * changed value. */ sleep(1); We should not be working around this in the testcase, it is a kernel bug. Fix it by copying the current DSCR to the child, instead of what we had in the thread struct at last context switch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-10 21:11:13 +11:00
Anton Blanchard	20dbe67062	powerpc: Call restore_sprs() before _switch() commit `152d523e63` ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") moved the restore of SPRs after the call to _switch(). There is an issue with this approach - new tasks do not return through _switch(), they are set up by copy_thread() to directly return through ret_from_fork() or ret_from_kernel_thread(). This means restore_sprs() is not getting called for new tasks. Fix this by moving restore_sprs() before _switch(). Fixes: `152d523e63` ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-10 21:10:55 +11:00
Anton Blanchard	d64d02ce4e	powerpc: Call check_if_tm_restore_required() in enable_kernel_() Commit `a0e72cf12b` ("powerpc: Create msr_check_and_{set,clear}()") removed a call to check_if_tm_restore_required() in the enable_kernel_() functions. Add them back in. Fixes: `a0e72cf12b` ("powerpc: Create msr_check_and_{set,clear}()") Reported-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-10 20:10:53 +11:00
Thomas Huth	696066f875	KVM: PPC: Increase memslots to 512 Only using 32 memslots for KVM on powerpc is way too low, you can nowadays hit this limit quite fast by adding a couple of PCI devices and/or pluggable memory DIMMs to the guest. x86 already increased the KVM_USER_MEM_SLOTS to 509, to satisfy 256 pluggable DIMM slots, 3 private slots and 253 slots for other things like PCI devices (i.e. resulting in 256 + 3 + 253 = 512 slots in total). We should do something similar for powerpc, and since we do not use private slots here, we can set the value to 512 directly. While we're at it, also remove the KVM_MEM_SLOTS_NUM definition from the powerpc-specific header since this gets defined in the generic kvm_host.h header anyway. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-12-10 11:36:24 +11:00
Paul Mackerras	c20875a3e6	KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR Currently it is possible for userspace (e.g. QEMU) to set a value for the MSR for a guest VCPU which has both of the TS bits set, which is an illegal combination. The result of this is that when we execute a hrfid (hypervisor return from interrupt doubleword) instruction to enter the guest, the CPU will take a TM Bad Thing type of program interrupt (vector 0x700). Now, if PR KVM is configured in the kernel along with HV KVM, we actually handle this without crashing the host or giving hypervisor privilege to the guest; instead what happens is that we deliver a program interrupt to the guest, with SRR0 reflecting the address of the hrfid instruction and SRR1 containing the MSR value at that point. If PR KVM is not configured in the kernel, then we try to run the host's program interrupt handler with the MMU set to the guest context, which almost certainly causes a host crash. This closes the hole by making kvmppc_set_msr_hv() check for the illegal combination and force the TS field to a safe value (00, meaning non-transactional). Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-12-10 11:34:27 +11:00
Al Viro	b808b1d632	don't open-code generic_file_llseek_size() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-12-09 13:00:45 -05:00
Geyslan G. Bem	edfaff269f	KVM: PPC: Book3S PR: Remove unused variable 'vcpu_book3s' The vcpu_book3s variable is assigned but never used. So remove it. Found using cppcheck. Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-12-09 16:06:32 +11:00
Thomas Huth	760a7364f2	KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8 In the old DABR register, the BT (Breakpoint Translation) bit is bit number 61. In the new DAWRX register, the WT (Watchpoint Translation) bit is bit number 59. So to move the DABR-BT bit into the position of the DAWRX-WT bit, it has to be shifted by two, not only by one. This fixes hardware watchpoints in gdb of older guests that only use the H_SET_DABR/X interface instead of the new H_SET_MODE interface. Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-12-09 16:05:01 +11:00
Paul Mackerras	1c9e3d51d5	KVM: PPC: Book3S HV: Handle unexpected traps in guest entry/exit code better As we saw with the TM Bad Thing type of program interrupt occurring on the hrfid that enters the guest, it is not completely impossible to have a trap occurring in the guest entry/exit code, despite the fact that the code has been written to avoid taking any traps. This adds a check in the kvmppc_handle_exit_hv() function to detect the case when a trap has occurred in the hypervisor-mode code, and instead of treating it just like a trap in guest code, we now print a message and return to userspace with a KVM_EXIT_INTERNAL_ERROR exit reason. Of the various interrupts that get handled in the assembly code in the guest exit path and that can return directly to the guest, the only one that can occur when MSR.HV=1 and MSR.EE=0 is machine check (other than system call, which we can avoid just by not doing a sc instruction). Therefore this adds code to the machine check path to ensure that if the MCE occurred in hypervisor mode, we exit to the host rather than trying to continue the guest. Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-12-09 15:46:14 +11:00
Andrew Donnellan	dc9c41bd9e	Revert "powerpc/eeh: Don't unfreeze PHB PE after reset" This reverts commit `527d10ef3a`. The reverted commit breaks cxlflash devices following an EEH reset (and possibly other cxl devices, however this has not been tested). The reverted commit changed the behaviour of eeh_reset_device() so that PHB PEs are not unfrozen following the completion of the reset. This should not be problematic, as no device resources should have been associated with the PHB PE. However, when attempting to load the cxlflash driver after a reset, the driver attempts to read Vital Product Data through a call to pci_read_vpd() (which is called on the physical cxl device, not on the virtual AFU device). pci_read_vpd() in turn attempts to read from the cxl device's config space. This fails, as the PE it's trying to read from is still frozen. In turn, the driver gets an -ENODEV and fails to initialise. It appears this issue only affects some parts of the VPD area, as "lspci -vvv", which only reads a subset of the VPD bytes, is not broken by the original patch. At this stage, we don't fully understand why we're trying to read a frozen PE, and we don't know how this affects other cxl devices. It is possible that there is an underlying bug in the cxl driver or the powerpc CAPI support code, or alternatively a bug in the PCI resource allocation/mapping code that is incorrectly mapping resources to PE#0. As such, this fix is incomplete, however it is necessary to prevent a serious regression in CAPI support. In the meantime, revert the commit, especially as it was intended to be a non-functional change. Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Daniel Axtens <dja@axtens.net> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-09 14:05:10 +11:00
Paul Gortmaker	5b01310cfc	powerpc/sbc8641: drop bogus PHY IRQ entries from DTS file This file was originally cloned off of the MPC8641D-HPCN reference platform, which actually had a PHY IRQ line connected. However this board does not. The bogus entry was largely inert and went undetected until commit `321beec504` ("net: phy: Use interrupts when available in NOLINK state") was added to the tree. With the above commit, the board fails to NFS boot since it sits waiting for a PHY IRQ event that of course never arrives. Removing the bogus entries from the DTS file fixes the issue. Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-09 14:00:39 +11:00
Alistair Popple	25642e1459	powerpc/opal-irqchip: Fix double endian conversion The OPAL event calls return a mask of events that are active in big endian format. This is checked when unmasking the events in the irqchip by comparison with a cached value. The cached value was stored in big endian format but should've been converted to CPU endian first. This bug leads to OPAL event delivery being delayed or dropped on some systems. Symptoms may include a non-functional console. The bug is fixed by calling opal_handle_events(...) instead of duplicating code in opal_event_unmask(...). Fixes: `9f0fd0499d` ("powerpc/powernv: Add a virtual irqchip for opal events") Cc: stable@vger.kernel.org # v4.2+ Reported-by: Douglas L Lehr <dllehr@us.ibm.com> Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-08 16:53:31 +11:00
Rusty Russell	7523e4dc50	module: use a structure to encapsulate layout. Makes it easier to handle init vs core cleanly, though the change is fairly invasive across random architectures. It simplifies the rbtree code immediately, however, while keeping the core data together in the same cachline (now iff the rbtree code is enabled). Acked-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2015-12-04 22:46:25 +01:00
Davidlohr Bueso	d5a73cadf3	lcoking/barriers, arch: Use smp barriers in smp_store_release() With commit `b92b8b35a2` ("locking/arch: Rename set_mb() to smp_store_mb()") it was made clear that the context of this call (and thus set_mb) is strictly for CPU ordering, as opposed to IO. As such all archs should use the smp variant of mb(), respecting the semantics and saving a mandatory barrier on UP. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <linux-arch@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-12-04 11:39:51 +01:00
Anton Blanchard	d1e1cf2e38	powerpc: clean up asm/switch_to.h Remove a bunch of unnecessary fallback functions and group things in a more logical way. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-02 19:34:41 +11:00
Anton Blanchard	f3d885ccba	powerpc: Rearrange __switch_to() Most of __switch_to() is housekeeping, TLB batching, timekeeping etc. Move these away from the more complex and critical context switching code. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-02 19:34:41 +11:00
Anton Blanchard	579e633e76	powerpc: create flush_all_to_thread() Create a single function that flushes everything (FP, VMX, VSX, SPE). Doing this all at once means we only do one MSR write. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-02 19:34:40 +11:00
Anton Blanchard	c208505900	powerpc: create giveup_all() Create a single function that gives everything up (FP, VMX, VSX, SPE). Doing this all at once means we only do one MSR write. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --fp --altivec --vector 0 0 shows an improvement of 3% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> [mpe: giveup_all() needs to be EXPORT_SYMBOL'ed] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-02 19:34:26 +11:00
Anton Blanchard	1f2e25b2d5	powerpc: Remove fp_enable() and vec_enable(), use msr_check_and_{set, clear}() More consolidation of our MSR available bit handling. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:26 +11:00
Anton Blanchard	3eb5d5888d	powerpc: Add ppc_strict_facility_enable boot option Add a boot option that strictly manages the MSR unavailable bits. This catches kernel uses of FP/Altivec/SPE that would otherwise corrupt user state. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:26 +11:00
Anton Blanchard	dc4fbba11e	powerpc: Create disable_kernel_{fp,altivec,vsx,spe}() The enable_kernel_*() functions leave the relevant MSR bits enabled until we exit the kernel sometime later. Create disable versions that wrap the kernel use of FP, Altivec VSX or SPE. While we don't want to disable it normally for performance reasons (MSR writes are slow), it will be used for a debug boot option that does this and catches bad uses in other areas of the kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:25 +11:00
Anton Blanchard	a0e72cf12b	powerpc: Create msr_check_and_{set,clear}() Create helper functions to set and clear MSR bits after first checking if they are already set. Grouping them will make it easy to avoid the MSR writes in a subsequent optimisation. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:25 +11:00
Anton Blanchard	a7d623d4d0	powerpc: Move part of giveup_vsx into c Move the MSR modification into c. Removing it from the assembly function will allow us to avoid costly MSR writes by batching them up. Check the FP and VMX bits before calling the relevant giveup_*() function. This makes giveup_vsx() and flush_vsx_to_thread() perform more like their sister functions, and allows us to use flush_vsx_to_thread() in the signal code. Move the check_if_tm_restore_required() check in. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:25 +11:00
Anton Blanchard	98da581e08	powerpc: Move part of giveup_fpu,altivec,spe into c Move the MSR modification into new c functions. Removing it from the low level functions will allow us to avoid costly MSR writes by batching them up. Move the check_if_tm_restore_required() check into these new functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:25 +11:00
Anton Blanchard	b51b1153d0	powerpc: Remove NULL task struct pointer checks in FP and vector code We used to allow giveup_*() to be called with a NULL task struct pointer. Now those cases are handled in the caller we can remove the checks. We can also remove giveup_altivec_notask() which is also unused. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:25 +11:00
Anton Blanchard	611b0e5c19	powerpc: Create mtmsrd_isync() mtmsrd_isync() will do an mtmsrd followed by an isync on older processors. On newer processors we avoid the isync via a feature fixup. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:25 +11:00
Anton Blanchard	b86fd2bd03	powerpc: Simplify TM restore checks Instead of having multiple giveup__maybe_transactional() functions, separate out the TM check into a new function called check_if_tm_restore_required(). This will make it easier to optimise the giveup_() functions in a subsequent patch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:24 +11:00
Anton Blanchard	af1bbc3dd3	powerpc: Remove UP only lazy floating point and vector optimisations The UP only lazy floating point and vector optimisations were written back when SMP was not common, and neither glibc nor gcc used vector instructions. Now SMP is very common, glibc aggressively uses vector instructions and gcc autovectorises. We want to add new optimisations that apply to both UP and SMP, but in preparation for that remove these UP only optimisations. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:24 +11:00
Anton Blanchard	68bfa962bf	powerpc: Remove redundant mflr in _switch No need to execute mflr twice. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:24 +11:00
Anton Blanchard	152d523e63	powerpc: Create context switch helpers save_sprs() and restore_sprs() Move all our context switch SPR save and restore code into two helpers. We do a few optimisations: - Group all mfsprs and all mtsprs. In many cases an mtspr sets a scoreboarding bit that an mfspr waits on, so the current practise of mfspr A; mtspr A; mfpsr B; mtspr B is the worst scheduling we can do. - SPR writes are slow, so check that the value is changing before writing it. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield 0 0 shows an improvement of almost 10% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:24 +11:00
Anton Blanchard	af72ab646a	powerpc: Don't disable MSR bits in do_load_up_transact_() functions Similar to the non TM load_up_() functions, don't disable the MSR bits on the way out. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:24 +11:00
Anton Blanchard	07e45c120c	powerpc: Don't disable kernel FP/VMX/VSX MSR bits on context switch Writing the MSR is slow, so we want to avoid it whenever possible. A subsequent patch will add a debug option that strictly manages the FP/VMX/VSX unavailable bits. For now just remove it, matching what we do in other areas of the kernel (eg enable_kernel_altivec()). A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --fp 0 0 shows an improvement of almost 3% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:24 +11:00
Paul Mackerras	31a40e2b05	powerpc/64: Include KVM guest test in all interrupt vectors Currently, if HV KVM is configured but PR KVM isn't, we don't include a test to see whether we were interrupted in KVM guest context for the set of interrupts which get delivered directly to the guest by hardware if they occur in the guest. This includes things like program interrupts. However, the recent bug where userspace could set the MSR for a VCPU to have an illegal value in the TS field, and thus cause a TM Bad Thing type of program interrupt on the hrfid that enters the guest, showed that we can never be completely sure that these interrupts can never occur in the guest entry/exit code. If one of these interrupts does happen and we have HV KVM configured but not PR KVM, then we end up trying to run the handler in the host with the MMU set to the guest MMU context, which generally ends badly. Thus, for robustness it is better to have the test in every interrupt vector, so that if some way is found to trigger some interrupt in the guest entry/exit path, we can handle it without immediately crashing the host. This means that the distinction between KVMTEST and KVMTEST_PR goes away. Thus we delete KVMTEST_PR and associated macros and use KVMTEST everywhere that we previously used either KVMTEST_PR or KVMTEST. It also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR, so we deleted SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-12-01 13:52:23 +11:00
David Hildenbrand	e09fefdeeb	KVM: Use common function for VCPU lookup by id Let's reuse the new common function for VPCU lookup by id. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [split out the new function into a separate patch]	2015-11-30 12:47:04 +01:00
Luis de Bethencourt	b4f8144559	powerpc/axonram: Fix module autoload for OF platform driver This platform driver has a OF device ID table but the OF module alias information is not created so module autoloading won't work. Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-26 22:11:18 +11:00
Rashmica Gupta	343c3327c1	powerpc: Add rN aliases to the pt_regs_offset table. It is common practice with powerpc to use 'rN' to refer to register 'N'. However when using the pt_regs_offset table we have to use 'gprN'. So add aliases such that both 'rN' and 'gprN' can be used. For example, we can currently do: $ su - $ cd /sys/kernel/debug/tracing $ echo "p:probe/sys_fchownat sys_fchownat %gpr3:s32 +0(%gpr4):string %gpr5:s32 %gpr6:s32 %gpr7:s32" > kprobe_events $ echo 1 > events/probe/sys_fchownat/enable $ touch /tmp/foo $ chown root /tmp/foo $ echo 0 > events/enable $ cat trace chown-2925 [014] d... 76.160657: sys_fchownat: (SyS_fchownat+0x8/0x1a0) arg1=-100 arg2="/tmp/foo" arg3=0 arg4=-1 arg5=0 Instead we'd like to be able to use: $ echo "p:probe/sys_fchownat sys_fchownat %r3:s32 +0(%r4):string %r5:s32 %r6:s32 %r7:s32" > kprobe_events Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-26 22:11:17 +11:00
Rashmica Gupta	f43194e458	powerpc: Standardise on NR_syscalls rather than __NR_syscalls. Most architectures use NR_syscalls as the #define for the number of syscalls. We use __NR_syscalls, and then define NR_syscalls as __NR_syscalls. __NR_syscalls is not used outside arch code, whereas NR_syscalls is. So as NR_syscalls must be defined and __NR_syscalls does not, replace __NR_syscalls with NR_syscalls. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-26 22:11:17 +11:00
Rashmica Gupta	cdfc8ed690	powerpc: Remove unused function trace_syscall() This function has been unused since commit `14cf11af6c` ("powerpc: Merge enough to start building in arch/powerpc."), so remove it. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-26 22:11:16 +11:00
Laurent Vivier	58531b0c80	powerpc/boot: allow wrapper to work on non-english system if the language is not english objdump output is not parsed correctly and format is "". Later, "ld -m $format" fails. This patch adds "LANG=C" to force english output for objdump. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-26 22:11:16 +11:00
John Ogness	57f889471c	powerpc/powermac: set IRQF_NO_THREAD for xmon/cascade handlers The xmon and cascade irq handlers must not run as threads. pmac_pic_lock is already a raw_spinlock, but the irq flag IRQF_NO_THREAD needs to be set as well. Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-26 22:11:05 +11:00
Yaowei Bai	378b417d65	KVM: powerpc: kvmppc_visible_gpa can be boolean In another patch kvm_is_visible_gfn is maken return bool due to this function only returns zero or one as its return value, let's also make kvmppc_visible_gpa return bool to keep consistent. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-11-25 17:24:24 +01:00
Guilherme G. Piccoli	e80e7edc55	PCI/MSI: Initialize MSI capability for all architectures `1851617cd2` ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI") moved dev->msi_cap and dev->msix_cap initialization from the pci_init_capabilities() path (used on all architectures) to the pci_setup_device() path (not used on Open Firmware architectures). This broke MSI or MSI-X on Open Firmware machines. `4d9aac397a` ("powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case") fixed it for PowerPC but not for SPARC. Set up MSI and MSI-X (initialize msi_cap and msix_cap and disable MSI and MSI-X) in pci_init_capabilities() so all architectures do it the same way. This reverts `4d9aac397a` since this patch fixes the problem generically for both PowerPC and SPARC. [bhelgaas: changelog, make pci_msi_setup_pci_dev() static] Fixes: `1851617cd2` ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI") Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2015-11-24 17:45:18 -06:00
Krzysztof Kozlowski	87630eb1d5	powerpc/powernv: Drop owner assignment from platform_driver platform_driver does not need to set an owner because platform_driver_register() will set it. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-24 14:21:28 +11:00
Michael Neuling	7f821fc9c7	powerpc/tm: Check for already reclaimed tasks Currently we can hit a scenario where we'll tm_reclaim() twice. This results in a TM bad thing exception because the second reclaim occurs when not in suspend mode. The scenario in which this can happen is the following. We attempt to deliver a signal to userspace. To do this we need obtain the stack pointer to write the signal context. To get this stack pointer we must tm_reclaim() in case we need to use the checkpointed stack pointer (see get_tm_stackpointer()). Normally we'd then return directly to userspace to deliver the signal without going through __switch_to(). Unfortunatley, if at this point we get an error (such as a bad userspace stack pointer), we need to exit the process. The exit will result in a __switch_to(). __switch_to() will attempt to save the process state which results in another tm_reclaim(). This tm_reclaim() now causes a TM Bad Thing exception as this state has already been saved and the processor is no longer in TM suspend mode. Whee! This patch checks the state of the MSR to ensure we are TM suspended before we attempt the tm_reclaim(). If we've already saved the state away, we should no longer be in TM suspend mode. This has the additional advantage of checking for a potential TM Bad Thing exception. Found using syscall fuzzer. Fixes: `fb09692e71` ("powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-23 20:18:03 +11:00
Michael Neuling	d2b9d2a5ad	powerpc/tm: Block signal return setting invalid MSR state Currently we allow both the MSR T and S bits to be set by userspace on a signal return. Unfortunately this is a reserved configuration and will cause a TM Bad Thing exception if attempted (via rfid). This patch checks for this case in both the 32 and 64 bit signals code. If both T and S are set, we mark the context as invalid. Found using a syscall fuzzer. Fixes: `2b0a576d15` ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-23 20:06:31 +11:00
Michael Ellerman	1451ad03fa	powerpc: Wire up sys_mlock2() The selftest passes on 64-bit LE and 32-bit BE. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-16 17:05:53 +11:00
Linus Torvalds	3370b69eb0	Four changes: - x86: work around two nasty cases where a benign exception occurs while another is being delivered. The endless stream of exceptions causes an infinite loop in the processor, which not even NMIs or SMIs can interrupt; in the virt case, there is no possibility to exit to the host either. - x86: support for Skylake per-guest TSC rate. Long supported by AMD, the patches mostly move things from there to common arch/x86/kvm/ code. - generic: remove local_irq_save/restore from the guest entry and exit paths when context tracking is enabled. The patches are a few months old, but we discussed them again at kernel summit. Andy will pick up from here and, in 4.5, try to remove it from the user entry/exit paths. - PPC: Two bug fixes, see merge commit `370289756b` for details. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWRFb0AAoJEL/70l94x66DjjMH/31jr8d119MW0uv2x+03+wRq 6dbJ8tjQ8grvBRExKvLsUVjDmHlhCa1BQl5qjCsyYhX9UeAf4NQOmoEFpq+YTLxh Ctveyn+yiZWC7qxbQDmauiQ4JCOp+W9ial782iqw5+ouQMajGOffq5WrojCa2ZNF jI278JgdHJLrKj/uie//WBu3V7MJY5Apc3p4zatnSYFSQ3MA0sxl4r4zIrwOa5qs 23ZeeoqbP4sHh4X5wL/30Y6XFSCHj0qoYHHyAgzLi0PCMvBdt4DrAFUPDG/Rhlv6 o1WB/kcUfcz3DtBX85wfSOMuw0nF6patWhWv07R/3EIbYoz3dKvp9d6ORYgXqlY= =Um9M -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull second batch of kvm updates from Paolo Bonzini: "Four changes: - x86: work around two nasty cases where a benign exception occurs while another is being delivered. The endless stream of exceptions causes an infinite loop in the processor, which not even NMIs or SMIs can interrupt; in the virt case, there is no possibility to exit to the host either. - x86: support for Skylake per-guest TSC rate. Long supported by AMD, the patches mostly move things from there to common arch/x86/kvm/ code. - generic: remove local_irq_save/restore from the guest entry and exit paths when context tracking is enabled. The patches are a few months old, but we discussed them again at kernel summit. Andy will pick up from here and, in 4.5, try to remove it from the user entry/exit paths. - PPC: Two bug fixes, see merge commit `370289756b` for details" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits) KVM: x86: rename update_db_bp_intercept to update_bp_intercept KVM: svm: unconditionally intercept #DB KVM: x86: work around infinite loop in microcode when #AC is delivered context_tracking: avoid irq_save/irq_restore on guest entry and exit context_tracking: remove duplicate enabled check KVM: VMX: Dump TSC multiplier in dump_vmcs() KVM: VMX: Use a scaled host TSC for guest readings of MSR_IA32_TSC KVM: VMX: Setup TSC scaling ratio when a vcpu is loaded KVM: VMX: Enable and initialize VMX TSC scaling KVM: x86: Use the correct vcpu's TSC rate to compute time scale KVM: x86: Move TSC scaling logic out of call-back read_l1_tsc() KVM: x86: Move TSC scaling logic out of call-back adjust_tsc_offset() KVM: x86: Replace call-back compute_tsc_offset() with a common function KVM: x86: Replace call-back set_tsc_khz() with a common function KVM: x86: Add a common TSC scaling function KVM: x86: Add a common TSC scaling ratio field in kvm_vcpu_arch KVM: x86: Collect information for setting TSC scaling ratio KVM: x86: declare a few variables as __read_mostly KVM: x86: merge handle_mmio_page_fault and handle_mmio_page_fault_common KVM: PPC: Book3S HV: Don't dynamically split core when already split ...	2015-11-12 14:34:06 -08:00
Linus Torvalds	3419b45039	Merge branch 'for-4.4/io-poll' of git://git.kernel.dk/linux-block Pull block IO poll support from Jens Axboe: "Various groups have been doing experimentation around IO polling for (really) fast devices. The code has been reviewed and has been sitting on the side for a few releases, but this is now good enough for coordinated benchmarking and further experimentation. Currently O_DIRECT sync read/write are supported. A framework is in the works that allows scalable stats tracking so we can auto-tune this. And we'll add libaio support as well soon. Fow now, it's an opt-in feature for test purposes" * 'for-4.4/io-poll' of git://git.kernel.dk/linux-block: direct-io: be sure to assign dio->bio_bdev for both paths directio: add block polling support NVMe: add blk polling support block: add block polling support blk-mq: return tag/queue combo in the make_request_fn handlers block: change ->make_request_fn() and users to return a queue cookie	2015-11-10 17:23:49 -08:00
Nicolas Pitre	77c5b5da02	kmap_atomic_to_page() has no users, remove it Removal started in commit `5bbeed12bd` ("sparc32: drop unused kmap_atomic_to_page"). Let's do it across the whole tree. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-09 15:11:24 -08:00
Jens Axboe	dece16353e	block: change ->make_request_fn() and users to return a queue cookie No functional changes in this patch, but it prepares us for returning a more useful cookie related to the IO that was queued up. Signed-off-by: Jens Axboe <axboe@fb.com> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Keith Busch <keith.busch@intel.com>	2015-11-07 10:40:46 -07:00
Linus Torvalds	3c87b79188	PCI changes for the v4.4 merge window: Resource management Add support for Enhanced Allocation devices (Sean O. Stalley) Add Enhanced Allocation register entries (Sean O. Stalley) Handle IORESOURCE_PCI_FIXED when sizing resources (David Daney) Handle IORESOURCE_PCI_FIXED when assigning resources (David Daney) Handle Enhanced Allocation capability for SR-IOV devices (David Daney) Clear IORESOURCE_UNSET when reverting to firmware-assigned address (Bjorn Helgaas) Make Enhanced Allocation bitmasks more obvious (Bjorn Helgaas) Expand Enhanced Allocation BAR output (Bjorn Helgaas) Add of_pci_check_probe_only to parse "linux,pci-probe-only" (Marc Zyngier) Fix lookup of linux,pci-probe-only property (Marc Zyngier) Add sparc mem64 resource parsing for root bus (Yinghai Lu) PCI device hotplug pciehp: Queue power work requests in dedicated function (Guenter Roeck) Driver binding Add builtin_pci_driver() to avoid registration boilerplate (Paul Gortmaker) Virtualization Set SR-IOV NumVFs to zero after enumeration (Alexander Duyck) Remove redundant validation of SR-IOV offset/stride registers (Alexander Duyck) Remove VFs in reverse order if virtfn_add() fails (Alexander Duyck) Reorder pcibios_sriov_disable() (Alexander Duyck) Wait 1 second between disabling VFs and clearing NumVFs (Alexander Duyck) Fix sriov_enable() error path for pcibios_enable_sriov() failures (Alexander Duyck) Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs (Ben Shelton) Don't try to restore VF BARs (Wei Yang) MSI Don't alloc pcibios-irq when MSI is enabled (Joerg Roedel) Add msi_controller setup_irqs() method for special multivector setup (Lucas Stach) Export all remapped MSIs to sysfs attributes (Romain Bezut) Disable MSI on SiS 761 (Ondrej Zary) AER Clear error status registers during enumeration and restore (Taku Izumi) Generic host bridge driver Fix lookup of linux,pci-probe-only property (Marc Zyngier) Allow multiple hosts with different map_bus() methods (David Daney) Pass starting bus number to pci_scan_root_bus() (David Daney) Fix address window calculation for non-zero starting bus (David Daney) Altera host bridge driver Add msi.h to ARM Kbuild (Ley Foon Tan) Add Altera PCIe host controller driver (Ley Foon Tan) Add Altera PCIe MSI driver (Ley Foon Tan) APM X-Gene host bridge driver Remove msi_controller assignment (Duc Dang) Broadcom iProc host bridge driver Fix header comment "Corporation" misspelling (Florian Fainelli) Fix code comment to match code (Ray Jui) Remove unused struct iproc_pcie.irqs[] (Ray Jui) Call pci_fixup_irqs() for ARM64 as well as ARM (Ray Jui) Fix PCIe reset logic (Ray Jui) Improve link detection logic (Ray Jui) Update PCIe device tree bindings (Ray Jui) Add outbound mapping support (Ray Jui) Freescale i.MX6 host bridge driver Return real error code from imx6_add_pcie_port() (Fabio Estevam) Add PCIE_PHY_RX_ASIC_OUT_VALID definition (Fabio Estevam) Freescale Layerscape host bridge driver Remove ls_pcie_establish_link() (Minghuan Lian) Ignore PCIe controllers in Endpoint mode (Minghuan Lian) Factor out SCFG related function (Minghuan Lian) Update ls_add_pcie_port() (Minghuan Lian) Remove unused fields from struct ls_pcie (Minghuan Lian) Add support for LS1043a and LS2080a (Minghuan Lian) Add ls_pcie_msi_host_init() (Minghuan Lian) HiSilicon host bridge driver Add HiSilicon SoC Hip05 PCIe driver (Zhou Wang) Marvell MVEBU host bridge driver Return zero for reserved or unimplemented config space (Russell King) Use exact config access size; don't read/modify/write (Russell King) Use of_get_available_child_count() (Russell King) Use for_each_available_child_of_node() to walk child nodes (Russell King) Report full node name when reporting a DT error (Russell King) Use port->name rather than "PCIe%d.%d" (Russell King) Move port parsing and resource claiming to separate function (Russell King) Fix memory leaks and refcount leaks (Russell King) Split port parsing and resource claiming from port setup (Russell King) Use gpio_set_value_cansleep() (Russell King) Use devm_kcalloc() to allocate an array (Russell King) Use gpio_desc to carry around gpio (Russell King) Improve clock/reset handling (Russell King) Add PCI Express root complex capability block (Russell King) Remove code restricting accesses to slot 0 (Russell King) NVIDIA Tegra host bridge driver Wrap static pgprot_t initializer with __pgprot() (Ard Biesheuvel) Renesas R-Car host bridge driver Build pci-rcar-gen2.c only on ARM (Geert Uytterhoeven) Build pcie-rcar.c only on ARM (Geert Uytterhoeven) Make PCI aware of the I/O resources (Phil Edworthy) Remove dependency on ARM-specific struct hw_pci (Phil Edworthy) Set root bus nr to that provided in DT (Phil Edworthy) Fix I/O offset for multiple host bridges (Phil Edworthy) ST Microelectronics SPEAr13xx host bridge driver Fix dw_pcie_cfg_read/write() usage (Gabriele Paoloni) Synopsys DesignWare host bridge driver Make "clocks" and "clock-names" optional DT properties (Bhupesh Sharma) Use exact access size in dw_pcie_cfg_read() (Gabriele Paoloni) Simplify dw_pcie_cfg_read/write() interfaces (Gabriele Paoloni) Require config accesses to be naturally aligned (Gabriele Paoloni) Make "num-lanes" an optional DT property (Gabriele Paoloni) Move calculation of bus addresses to DRA7xx (Gabriele Paoloni) Replace ARM pci_sys_data->align_resource with global function pointer (Gabriele Paoloni) Factor out MSI msg setup (Lucas Stach) Implement multivector MSI IRQ setup (Lucas Stach) Make get_msi_addr() return phys_addr_t, not u32 (Lucas Stach) Set up high part of MSI target address (Lucas Stach) Fix PORT_LOGIC_LINK_WIDTH_MASK (Zhou Wang) Revert "PCI: designware: Program ATU with untranslated address" (Zhou Wang) Use of_pci_get_host_bridge_resources() to parse DT (Zhou Wang) Make driver arch-agnostic (Zhou Wang) Miscellaneous Make x86 pci_subsys_init() static (Alexander Kuleshov) Turn off Request Attributes to avoid Chelsio T5 Completion erratum (Hariprasad Shenai) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWPM9aAAoJEFmIoMA60/r89f8QALFMHegqKk5M08ZcCaG7unLF 5U5t88y6KF/kNYF6HOqLQiHh79U3ToU5HdrNaAtutr0UnvgbFC2WulLqKYgLiq6Y YJnwR3EfgGmdG7DKAVAXq19+nc2hgTPAEe8sciU7HKlTbqQmGj6//y3sQULGNLx3 zur0C33DCtrDgKDP7to273lkHO8Vl0YuLzyqwt0ePCMNcXR0h1dK8QxSTjuXBxaR c+T1V1Hw64MTxLz3xJd1/1ipy32u+9LnqqcdRz0zRq6qi48G9ch/i4Z6DHa8kTbj DUZrrTYKILQ2TcjcZSBmTueX11Z1Xa4/I/45sehIi6gVWL9qQbmGpt2E5YtFED+4 GdcmBSbWG/qsNsabXk38uiM3ww7+ltXEOhTXbcK+EgjvIhE6gSK/plYG0fU9pybs AKViEXVdHoT1X0N1dLK12mq7kvDCQvShHn08lz97Q9YrZ32wv1Fnij6WVSbJvfWt DubtPtisVM+rVy+VTpOImNR9wO54lTmG5jK53yNqH7I20K89y1kqARlN9nMXMB1a 2nQnwe9yWlsGj9gVNCn1KmyQSPOWjg+3Z+ekfwbxpca14s1AaN3Jm0N9Z61dXFoF y2ygoQtZ/z9BHr3quBpxXGt+aVUg2kcNw5GYeDYiALxXdJSObyzRrZ6HDb/zicU2 ZH9hBj0ctXvucmy6I2mt =uZrt -----END PGP SIGNATURE----- Merge tag 'pci-v4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Resource management: - Add support for Enhanced Allocation devices (Sean O. Stalley) - Add Enhanced Allocation register entries (Sean O. Stalley) - Handle IORESOURCE_PCI_FIXED when sizing resources (David Daney) - Handle IORESOURCE_PCI_FIXED when assigning resources (David Daney) - Handle Enhanced Allocation capability for SR-IOV devices (David Daney) - Clear IORESOURCE_UNSET when reverting to firmware-assigned address (Bjorn Helgaas) - Make Enhanced Allocation bitmasks more obvious (Bjorn Helgaas) - Expand Enhanced Allocation BAR output (Bjorn Helgaas) - Add of_pci_check_probe_only to parse "linux,pci-probe-only" (Marc Zyngier) - Fix lookup of linux,pci-probe-only property (Marc Zyngier) - Add sparc mem64 resource parsing for root bus (Yinghai Lu) PCI device hotplug: - pciehp: Queue power work requests in dedicated function (Guenter Roeck) Driver binding: - Add builtin_pci_driver() to avoid registration boilerplate (Paul Gortmaker) Virtualization: - Set SR-IOV NumVFs to zero after enumeration (Alexander Duyck) - Remove redundant validation of SR-IOV offset/stride registers (Alexander Duyck) - Remove VFs in reverse order if virtfn_add() fails (Alexander Duyck) - Reorder pcibios_sriov_disable() (Alexander Duyck) - Wait 1 second between disabling VFs and clearing NumVFs (Alexander Duyck) - Fix sriov_enable() error path for pcibios_enable_sriov() failures (Alexander Duyck) - Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs (Ben Shelton) - Don't try to restore VF BARs (Wei Yang) MSI: - Don't alloc pcibios-irq when MSI is enabled (Joerg Roedel) - Add msi_controller setup_irqs() method for special multivector setup (Lucas Stach) - Export all remapped MSIs to sysfs attributes (Romain Bezut) - Disable MSI on SiS 761 (Ondrej Zary) AER: - Clear error status registers during enumeration and restore (Taku Izumi) Generic host bridge driver: - Fix lookup of linux,pci-probe-only property (Marc Zyngier) - Allow multiple hosts with different map_bus() methods (David Daney) - Pass starting bus number to pci_scan_root_bus() (David Daney) - Fix address window calculation for non-zero starting bus (David Daney) Altera host bridge driver: - Add msi.h to ARM Kbuild (Ley Foon Tan) - Add Altera PCIe host controller driver (Ley Foon Tan) - Add Altera PCIe MSI driver (Ley Foon Tan) APM X-Gene host bridge driver: - Remove msi_controller assignment (Duc Dang) Broadcom iProc host bridge driver: - Fix header comment "Corporation" misspelling (Florian Fainelli) - Fix code comment to match code (Ray Jui) - Remove unused struct iproc_pcie.irqs[] (Ray Jui) - Call pci_fixup_irqs() for ARM64 as well as ARM (Ray Jui) - Fix PCIe reset logic (Ray Jui) - Improve link detection logic (Ray Jui) - Update PCIe device tree bindings (Ray Jui) - Add outbound mapping support (Ray Jui) Freescale i.MX6 host bridge driver: - Return real error code from imx6_add_pcie_port() (Fabio Estevam) - Add PCIE_PHY_RX_ASIC_OUT_VALID definition (Fabio Estevam) Freescale Layerscape host bridge driver: - Remove ls_pcie_establish_link() (Minghuan Lian) - Ignore PCIe controllers in Endpoint mode (Minghuan Lian) - Factor out SCFG related function (Minghuan Lian) - Update ls_add_pcie_port() (Minghuan Lian) - Remove unused fields from struct ls_pcie (Minghuan Lian) - Add support for LS1043a and LS2080a (Minghuan Lian) - Add ls_pcie_msi_host_init() (Minghuan Lian) HiSilicon host bridge driver: - Add HiSilicon SoC Hip05 PCIe driver (Zhou Wang) Marvell MVEBU host bridge driver: - Return zero for reserved or unimplemented config space (Russell King) - Use exact config access size; don't read/modify/write (Russell King) - Use of_get_available_child_count() (Russell King) - Use for_each_available_child_of_node() to walk child nodes (Russell King) - Report full node name when reporting a DT error (Russell King) - Use port->name rather than "PCIe%d.%d" (Russell King) - Move port parsing and resource claiming to separate function (Russell King) - Fix memory leaks and refcount leaks (Russell King) - Split port parsing and resource claiming from port setup (Russell King) - Use gpio_set_value_cansleep() (Russell King) - Use devm_kcalloc() to allocate an array (Russell King) - Use gpio_desc to carry around gpio (Russell King) - Improve clock/reset handling (Russell King) - Add PCI Express root complex capability block (Russell King) - Remove code restricting accesses to slot 0 (Russell King) NVIDIA Tegra host bridge driver: - Wrap static pgprot_t initializer with __pgprot() (Ard Biesheuvel) Renesas R-Car host bridge driver: - Build pci-rcar-gen2.c only on ARM (Geert Uytterhoeven) - Build pcie-rcar.c only on ARM (Geert Uytterhoeven) - Make PCI aware of the I/O resources (Phil Edworthy) - Remove dependency on ARM-specific struct hw_pci (Phil Edworthy) - Set root bus nr to that provided in DT (Phil Edworthy) - Fix I/O offset for multiple host bridges (Phil Edworthy) ST Microelectronics SPEAr13xx host bridge driver: - Fix dw_pcie_cfg_read/write() usage (Gabriele Paoloni) Synopsys DesignWare host bridge driver: - Make "clocks" and "clock-names" optional DT properties (Bhupesh Sharma) - Use exact access size in dw_pcie_cfg_read() (Gabriele Paoloni) - Simplify dw_pcie_cfg_read/write() interfaces (Gabriele Paoloni) - Require config accesses to be naturally aligned (Gabriele Paoloni) - Make "num-lanes" an optional DT property (Gabriele Paoloni) - Move calculation of bus addresses to DRA7xx (Gabriele Paoloni) - Replace ARM pci_sys_data->align_resource with global function pointer (Gabriele Paoloni) - Factor out MSI msg setup (Lucas Stach) - Implement multivector MSI IRQ setup (Lucas Stach) - Make get_msi_addr() return phys_addr_t, not u32 (Lucas Stach) - Set up high part of MSI target address (Lucas Stach) - Fix PORT_LOGIC_LINK_WIDTH_MASK (Zhou Wang) - Revert "PCI: designware: Program ATU with untranslated address" (Zhou Wang) - Use of_pci_get_host_bridge_resources() to parse DT (Zhou Wang) - Make driver arch-agnostic (Zhou Wang) Miscellaneous: - Make x86 pci_subsys_init() static (Alexander Kuleshov) - Turn off Request Attributes to avoid Chelsio T5 Completion erratum (Hariprasad Shenai)" * tag 'pci-v4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits) PCI: altera: Add Altera PCIe MSI driver PCI: hisi: Add HiSilicon SoC Hip05 PCIe driver PCI: layerscape: Add ls_pcie_msi_host_init() PCI: layerscape: Add support for LS1043a and LS2080a PCI: layerscape: Remove unused fields from struct ls_pcie PCI: layerscape: Update ls_add_pcie_port() PCI: layerscape: Factor out SCFG related function PCI: layerscape: Ignore PCIe controllers in Endpoint mode PCI: layerscape: Remove ls_pcie_establish_link() PCI: designware: Make "clocks" and "clock-names" optional DT properties PCI: designware: Make driver arch-agnostic ARM/PCI: Replace pci_sys_data->align_resource with global function pointer PCI: designware: Use of_pci_get_host_bridge_resources() to parse DT Revert "PCI: designware: Program ATU with untranslated address" PCI: designware: Move calculation of bus addresses to DRA7xx PCI: designware: Make "num-lanes" an optional DT property PCI: designware: Require config accesses to be naturally aligned PCI: designware: Simplify dw_pcie_cfg_read/write() interfaces PCI: designware: Use exact access size in dw_pcie_cfg_read() PCI: spear: Fix dw_pcie_cfg_read/write() usage ...	2015-11-06 11:29:53 -08:00
Linus Torvalds	2f4bf528ec	powerpc updates for 4.4 - Kconfig: remove BE-only platforms from LE kernel build from Boqun Feng - Refresh ps3_defconfig from Geoff Levand - Emit GNU & SysV hashes for the vdso from Michael Ellerman - Define an enum for the bolted SLB indexes from Anshuman Khandual - Use a local to avoid multiple calls to get_slb_shadow() from Michael Ellerman - Add gettimeofday() benchmark from Michael Neuling - Avoid link stack corruption in __get_datapage() from Michael Neuling - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar K.V - Add ppc64le_defconfig from Michael Ellerman - pseries: extract of_helpers module from Andy Shevchenko - Correct string length in pseries_of_derive_parent() from Nathan Fontenot - Free the MSI bitmap if it was slab allocated from Denis Kirjanov - Shorten irq_chip name for the SIU from Christophe Leroy - Wait 1s for secondaries to enter OPAL during kexec from Samuel Mendoza-Jonas - Fix _ALIGN_* errors due to type difference. from Aneesh Kumar K.V - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from Colin Ian King - Disable hugepd for 64K page size. from Aneesh Kumar K.V - Differentiate between hugetlb and THP during page walk from Aneesh Kumar K.V - Make PCI non-optional for pseries from Michael Ellerman - Individual System V IPC system calls from Sam bobroff - Add selftest of unmuxed IPC calls from Michael Ellerman - discard .exit.data at runtime from Stephen Rothwell - Delete old orphaned PrPMC 280/2800 DTS and boot file. from Paul Gortmaker - Use of_get_next_parent to simplify code from Christophe Jaillet - Paginate some xmon output from Sam bobroff - Add some more elements to the xmon PACA dump from Michael Ellerman - Allow the tm-syscall selftest to build with old headers from Michael Ellerman - Run EBB selftests only on POWER8 from Denis Kirjanov - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael Ellerman - Avoid reference to potentially freed memory in prom.c from Christophe Jaillet - Quieten boot wrapper output with run_cmd from Geoff Levand - EEH fixes and cleanups from Gavin Shan - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael Ellerman - Fix section mismatch warning in msi_bitmap_alloc() from Denis Kirjanov - Fix ps3-lpm white space from Rudhresh Kumar J - Fix ps3-vuart null dereference from Colin King - nvram: Add missing kfree in error path from Christophe Jaillet - nvram: Fix function name in some errors messages. from Christophe Jaillet - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro Koskinen - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov - cxl: Free virtual PHB when removing from Andrew Donnellan - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from Michael Ellerman - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building with O= from Michael Ellerman - Freescale updates from Scott: Highlights include 64-bit book3e kexec/kdump support, a rework of the qoriq clock driver, device tree changes including qoriq fman nodes, support for a new 85xx board, and some fixes. - MPC5xxx updates from Anatolij: Highlights include a driver for MPC512x LocalPlus Bus FIFO with its device tree binding documentation, mpc512x device tree updates and some minor fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWPEZgAAoJEFHr6jzI4aWANjYQAKX2Q/95hqKfCuF5FBcUmtMC Pu/Nff027MVzxZ2ApDcvvLGps5Nz2bn3nIhc9zjkXc5E8DuL6X3Yl8ce7qyNcc3g cJJ8RvtUo6J1OMWetXFehtPYniAAwKMhZYKnj0+WnLr2SyH/Vhl3ehDkFbGyPtuH r+2E7krFjfVgU+bzciIFnOaDekFuFN/pXWMb6e6zQyBJe9N8ZIp96uouGCebKVd0 VDLItzdaKErT8JFfbymMPvZm3V0rMVx4WWu3kAbQX8LrD5a18NF1zrjAOHRXc61n kkk8/DPuNOon1PbXXyiS5BcFyZRe+KE3VBnoW5sOMqMIRg5WdO1oU3e2pEfXMO8+ leXYwFLXiKzUZuOgQG2QiUhrzD2yC1o6/TJWATv0dSl9AwrecgPX+Vj6X357slAf A9E3eMy5tgnpndBWZmvZS3W7YDKH+NkeZ+Q40+NErAlqr++ErrTcKVndk5vWlYTT 7mMZeTXagX66al/k5ATKqwB7iUSpnYHSAa9fcUYPSM2FnXsDxPyeJGkBbcoOmkGj QrpgNYOvJaUJd076goZCV39v0c1xpfV9/9kyVch8HUadf6JcjpVZwYnbGw2qlJjh ZanuBG2VOeSwaKQqXiRBSBetnpAg8CVpFjDmX9wOBfSek2wxEJqDX/vQExdbIDQQ pUs7vnUxLzhmW/x+ygOI =YwcM -----END PGP SIGNATURE----- Merge tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Kconfig: remove BE-only platforms from LE kernel build from Boqun Feng - Refresh ps3_defconfig from Geoff Levand - Emit GNU & SysV hashes for the vdso from Michael Ellerman - Define an enum for the bolted SLB indexes from Anshuman Khandual - Use a local to avoid multiple calls to get_slb_shadow() from Michael Ellerman - Add gettimeofday() benchmark from Michael Neuling - Avoid link stack corruption in __get_datapage() from Michael Neuling - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar K.V - Add ppc64le_defconfig from Michael Ellerman - pseries: extract of_helpers module from Andy Shevchenko - Correct string length in pseries_of_derive_parent() from Nathan Fontenot - Free the MSI bitmap if it was slab allocated from Denis Kirjanov - Shorten irq_chip name for the SIU from Christophe Leroy - Wait 1s for secondaries to enter OPAL during kexec from Samuel Mendoza-Jonas - Fix _ALIGN_* errors due to type difference, from Aneesh Kumar K.V - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from Colin Ian King - Disable hugepd for 64K page size, from Aneesh Kumar K.V - Differentiate between hugetlb and THP during page walk from Aneesh Kumar K.V - Make PCI non-optional for pseries from Michael Ellerman - Individual System V IPC system calls from Sam bobroff - Add selftest of unmuxed IPC calls from Michael Ellerman - discard .exit.data at runtime from Stephen Rothwell - Delete old orphaned PrPMC 280/2800 DTS and boot file, from Paul Gortmaker - Use of_get_next_parent to simplify code from Christophe Jaillet - Paginate some xmon output from Sam bobroff - Add some more elements to the xmon PACA dump from Michael Ellerman - Allow the tm-syscall selftest to build with old headers from Michael Ellerman - Run EBB selftests only on POWER8 from Denis Kirjanov - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael Ellerman - Avoid reference to potentially freed memory in prom.c from Christophe Jaillet - Quieten boot wrapper output with run_cmd from Geoff Levand - EEH fixes and cleanups from Gavin Shan - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael Ellerman - Fix section mismatch warning in msi_bitmap_alloc() from Denis Kirjanov - Fix ps3-lpm white space from Rudhresh Kumar J - Fix ps3-vuart null dereference from Colin King - nvram: Add missing kfree in error path from Christophe Jaillet - nvram: Fix function name in some errors messages, from Christophe Jaillet - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro Koskinen - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov - cxl: Free virtual PHB when removing from Andrew Donnellan - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from Michael Ellerman - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building with O= from Michael Ellerman - Freescale updates from Scott: Highlights include 64-bit book3e kexec/kdump support, a rework of the qoriq clock driver, device tree changes including qoriq fman nodes, support for a new 85xx board, and some fixes. - MPC5xxx updates from Anatolij: Highlights include a driver for MPC512x LocalPlus Bus FIFO with its device tree binding documentation, mpc512x device tree updates and some minor fixes. * tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (106 commits) powerpc/msi: Fix section mismatch warning in msi_bitmap_alloc() powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id() powerpc/pseries: Correct string length in pseries_of_derive_parent() powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s) powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan powerpc/fsl: Add #clock-cells and clockgen label to clockgen nodes powerpc: handle error case in cpm_muram_alloc() powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake powerpc/book3e-64: Enable kexec powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32 powerpc/book3e-64/kexec: Enable SMP release powerpc/book3e-64/kexec: create an identity TLB mapping powerpc/book3e-64: Don't limit paca to 256 MiB powerpc/book3e/kdump: Enable crash_kexec_wait_realmode powerpc/book3e: support CONFIG_RELOCATABLE powerpc/booke64: Fix args to copy_and_flush powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts powerpc/e6500: kexec: Handle hardware threads ...	2015-11-05 23:38:43 -08:00
Linus Torvalds	2e3078af2c	Merge branch 'akpm' (patches from Andrew) Merge patch-bomb from Andrew Morton: - inotify tweaks - some ocfs2 updates (many more are awaiting review) - various misc bits - kernel/watchdog.c updates - Some of mm. I have a huge number of MM patches this time and quite a lot of it is quite difficult and much will be held over to next time. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits) selftests: vm: add tests for lock on fault mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage mm: introduce VM_LOCKONFAULT mm: mlock: add new mlock system call mm: mlock: refactor mlock, munlock, and munlockall code kasan: always taint kernel on report mm, slub, kasan: enable user tracking by default with KASAN=y kasan: use IS_ALIGNED in memory_is_poisoned_8() kasan: Fix a type conversion error lib: test_kasan: add some testcases kasan: update reference to kasan prototype repo kasan: move KASAN_SANITIZE in arch/x86/boot/Makefile kasan: various fixes in documentation kasan: update log messages kasan: accurately determine the type of the bad access kasan: update reported bug types for kernel memory accesses kasan: update reported bug types for not user nor kernel memory accesses mm/kasan: prevent deadlock in kasan reporting mm/kasan: don't use kasan shadow pointer in generic functions mm/kasan: MODULE_VADDR is not available on all archs ...	2015-11-05 23:10:54 -08:00
Paul Mackerras	f74f2e2e26	KVM: PPC: Book3S HV: Don't dynamically split core when already split In static micro-threading modes, the dynamic micro-threading code is supposed to be disabled, because subcores can't make independent decisions about what micro-threading mode to put the core in - there is only one micro-threading mode for the whole core. The code that implements dynamic micro-threading checks for this, except that the check was missed in one case. This means that it is possible for a subcore in static 2-way micro-threading mode to try to put the core into 4-way micro-threading mode, which usually leads to stuck CPUs, spinlock lockups, and other stalls in the host. The problem was in the can_split_piggybacked_subcores() function, which should always return false if the system is in a static micro-threading mode. This fixes the problem by making can_split_piggybacked_subcores() use subcore_config_ok() for its checks, as subcore_config_ok() includes the necessary check for the static micro-threading modes. Credit to Gautham Shenoy for working out that the reason for the hangs and stalls we were seeing was that we were trying to do dynamic 4-way micro-threading while we were in static 2-way mode. Fixes: `b4deba5c41` Cc: vger@stable.kernel.org # v4.3 Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-11-06 16:02:59 +11:00
Paul Mackerras	cf29b21595	KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails When handling a hypervisor data or instruction storage interrupt (HDSI or HISI), we look up the SLB entry for the address being accessed in order to translate the effective address to a virtual address which can be looked up in the guest HPT. This lookup can occasionally fail due to the guest replacing an SLB entry without invalidating the evicted SLB entry. In this situation an ERAT (effective to real address translation cache) entry can persist and be used by the hardware even though there is no longer a corresponding SLB entry. Previously we would just deliver a data or instruction storage interrupt (DSI or ISI) to the guest in this case. However, this is not correct and has been observed to cause guests to crash, typically with a data storage protection interrupt on a store to the vmemmap area. Instead, what we do now is to synthesize a data or instruction segment interrupt. That should cause the guest to reload an appropriate entry into the SLB and retry the faulting instruction. If it still faults, we should find an appropriate SLB entry next time and be able to handle the fault. Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-11-06 15:40:42 +11:00
Eric B Munson	b0f205c2a3	mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage The previous patch introduced a flag that specified pages in a VMA should be placed on the unevictable LRU, but they should not be made present when the area is created. This patch adds the ability to set this state via the new mlock system calls. We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall. MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED. MCL_ONFAULT should be used as a modifier to the two other mlockall flags. When used with MCL_CURRENT, all current mappings will be marked with VM_LOCKED \| VM_LOCKONFAULT. When used with MCL_FUTURE, the mm->def_flags will be marked with VM_LOCKED \| VM_LOCKONFAULT. When used with both MCL_CURRENT and MCL_FUTURE, all current mappings and mm->def_flags will be marked with VM_LOCKED \| VM_LOCKONFAULT. Prior to this patch, mlockall() will unconditionally clear the mm->def_flags any time it is called without MCL_FUTURE. This behavior is maintained after adding MCL_ONFAULT. If a call to mlockall(MCL_FUTURE) is followed by mlockall(MCL_CURRENT), the mm->def_flags will be cleared and new VMAs will be unlocked. This remains true with or without MCL_ONFAULT in either mlockall() invocation. munlock() will unconditionally clear both vma flags. munlockall() unconditionally clears for VMA flags on all VMAs and in the mm->def_flags field. Signed-off-by: Eric B Munson <emunson@akamai.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-05 19:34:48 -08:00
Raghavendra K T	c118baf802	arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing nodes With the setup_nr_nodes(), we have already initialized node_possible_map. So it is safe to use for_each_node here. There are many places in the kernel that use hardcoded 'for' loop with nr_node_ids, because all other architectures have numa nodes populated serially. That should be reason we had maintained the same for powerpc. But, since sparse numa node ids possible on powerpc, we unnecessarily allocate memory for non existent numa nodes. For e.g., on a system with 0,1,16,17 as numa nodes nr_node_ids=18 and we allocate memory for nodes 2-14. This patch we allocate memory for only existing numa nodes. The patch is boot tested on a 4 node tuleta, confirming with printks that it works as expected. Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anton Blanchard <anton@samba.org> Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-05 19:34:48 -08:00
Andrew Morton	0ab32b6f1b	uaccess: reimplement probe_kernel_address() using probe_kernel_read() probe_kernel_address() is basically the same as the (later added) probe_kernel_read(). The return value on EFAULT is a bit different: probe_kernel_address() returns number-of-bytes-not-copied whereas probe_kernel_read() returns -EFAULT. All callers have been checked, none cared. probe_kernel_read() can be overridden by the architecture whereas probe_kernel_address() cannot. parisc, blackfin and um do this, to insert additional checking. Hence this patch possibly fixes obscure bugs, although there are only two probe_kernel_address() callsites outside arch/. My first attempt involved removing probe_kernel_address() entirely and converting all callsites to use probe_kernel_read() directly, but that got tiresome. This patch shrinks mm/slab_common.o by 218 bytes. For a single probe_kernel_address() callsite. Cc: Steven Miao <realmz6@gmail.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-05 19:34:48 -08:00
Linus Torvalds	933425fb00	s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWO2IQAAoJEL/70l94x66D/K0H/3AovAgYmJQToZlimsktMk6a f2xhdIqfU5lIQQh5uNBCfL3o9o8H9Py1ym7aEw3fmztPHHJYc91oTatt2UEKhmEw VtZHp/dFHt3hwaIdXmjRPEXiYctraKCyrhaUYdWmUYkoKi7lW5OL5h+S7frG2U6u p/hFKnHRZfXHr6NSgIqvYkKqtnc+C0FWY696IZMzgCksOO8jB1xrxoSN3tANW3oJ PDV+4og0fN/Fr1capJUFEc/fejREHneANvlKrLaa8ht0qJQutoczNADUiSFLcMPG iHljXeDsv5eyjMtUuIL8+MPzcrIt/y4rY41ZPiKggxULrXc6H+JJL/e/zThZpXc= =iv2z -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "First batch of KVM changes for 4.4. s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: Quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits) KVM: VMX: Fix commit which broke PML KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0() KVM: x86: allow RSM from 64-bit mode KVM: VMX: fix SMEP and SMAP without EPT KVM: x86: move kvm_set_irq_inatomic to legacy device assignment KVM: device assignment: remove pointless #ifdefs KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic KVM: x86: zero apic_arb_prio on reset drivers/hv: share Hyper-V SynIC constants with userspace KVM: x86: handle SMBASE as physical address in RSM KVM: x86: add read_phys to x86_emulate_ops KVM: x86: removing unused variable KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr() KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings KVM: arm/arm64: Optimize away redundant LR tracking KVM: s390: use simple switch statement as multiplexer KVM: s390: drop useless newline in debugging data KVM: s390: SCA must not cross page boundaries KVM: arm: Do not indent the arguments of DECLARE_BITMAP ...	2015-11-05 16:26:26 -08:00
Linus Torvalds	1873499e13	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem update from James Morris: "This is mostly maintenance updates across the subsystem, with a notable update for TPM 2.0, and addition of Jarkko Sakkinen as a maintainer of that" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (40 commits) apparmor: clarify CRYPTO dependency selinux: Use a kmem_cache for allocation struct file_security_struct selinux: ioctl_has_perm should be static selinux: use sprintf return value selinux: use kstrdup() in security_get_bools() selinux: use kmemdup in security_sid_to_context_core() selinux: remove pointless cast in selinux_inode_setsecurity() selinux: introduce security_context_str_to_sid selinux: do not check open perm on ftruncate call selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default KEYS: Merge the type-specific data with the payload data KEYS: Provide a script to extract a module signature KEYS: Provide a script to extract the sys cert list from a vmlinux file keys: Be more consistent in selection of union members used certs: add .gitignore to stop git nagging about x509_certificate_list KEYS: use kvfree() in add_key Smack: limited capability for changing process label TPM: remove unnecessary little endian conversion vTPM: support little endian guests char: Drop owner assignment from i2c_driver ...	2015-11-05 15:32:38 -08:00
Michael Ellerman	8bdf2023e2	Merge branch 'next' of git://git.denx.de/linux-denx-agust into next MPC5xxx updates from Anatolij: "Highlights include a driver for MPC512x LocalPlus Bus FIFO with its device tree binding documentation, mpc512x device tree updates and some minor fixes."	2015-11-05 19:35:12 +11:00
Linus Torvalds	b0f85fa11a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: Changes of note: 1) Allow to schedule ICMP packets in IPVS, from Alex Gartrell. 2) Provide FIB table ID in ipv4 route dumps just as ipv6 does, from David Ahern. 3) Allow the user to ask for the statistics to be filtered out of ipv4/ipv6 address netlink dumps. From Sowmini Varadhan. 4) More work to pass the network namespace context around deep into various packet path APIs, starting with the netfilter hooks. From Eric W Biederman. 5) Add layer 2 TX/RX checksum offloading to qeth driver, from Thomas Richter. 6) Use usec resolution for SYN/ACK RTTs in TCP, from Yuchung Cheng. 7) Support Very High Throughput in wireless MESH code, from Bob Copeland. 8) Allow setting the ageing_time in switchdev/rocker. From Scott Feldman. 9) Properly autoload L2TP type modules, from Stephen Hemminger. 10) Fix and enable offload features by default in 8139cp driver, from David Woodhouse. 11) Support both ipv4 and ipv6 sockets in a single vxlan device, from Jiri Benc. 12) Fix CWND limiting of thin streams in TCP, from Bendik Rønning Opstad. 13) Fix IPSEC flowcache overflows on large systems, from Steffen Klassert. 14) Convert bridging to track VLANs using rhashtable entries rather than a bitmap. From Nikolay Aleksandrov. 15) Make TCP listener handling completely lockless, this is a major accomplishment. Incoming request sockets now live in the established hash table just like any other socket too. From Eric Dumazet. 15) Provide more bridging attributes to netlink, from Nikolay Aleksandrov. 16) Use hash based algorithm for ipv4 multipath routing, this was very long overdue. From Peter Nørlund. 17) Several y2038 cures, mostly avoiding timespec. From Arnd Bergmann. 18) Allow non-root execution of EBPF programs, from Alexei Starovoitov. 19) Support SO_INCOMING_CPU as setsockopt, from Eric Dumazet. This influences the port binding selection logic used by SO_REUSEPORT. 20) Add ipv6 support to VRF, from David Ahern. 21) Add support for Mellanox Spectrum switch ASIC, from Jiri Pirko. 22) Add rtl8xxxu Realtek wireless driver, from Jes Sorensen. 23) Implement RACK loss recovery in TCP, from Yuchung Cheng. 24) Support multipath routes in MPLS, from Roopa Prabhu. 25) Fix POLLOUT notification for listening sockets in AF_UNIX, from Eric Dumazet. 26) Add new QED Qlogic river, from Yuval Mintz, Manish Chopra, and Sudarsana Kalluru. 27) Don't fetch timestamps on AF_UNIX sockets, from Hannes Frederic Sowa. 28) Support ipv6 geneve tunnels, from John W Linville. 29) Add flood control support to switchdev layer, from Ido Schimmel. 30) Fix CHECKSUM_PARTIAL handling of potentially fragmented frames, from Hannes Frederic Sowa. 31) Support persistent maps and progs in bpf, from Daniel Borkmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1790 commits) sh_eth: use DMA barriers switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion net: sched: kill dead code in sch_choke.c irda: Delete an unnecessary check before the function call "irlmp_unregister_service" net: dsa: mv88e6xxx: include DSA ports in VLANs net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports net/core: fix for_each_netdev_feature vlan: Invoke driver vlan hooks only if device is present arcnet/com20020: add LEDS_CLASS dependency bpf, verifier: annotate verbose printer with __printf dp83640: Only wait for timestamps for packets with timestamping enabled. ptp: Change ptp_class to a proper bitmask dp83640: Prune rx timestamp list before reading from it dp83640: Delay scheduled work. dp83640: Include hash in timestamp/packet matching ipv6: fix tunnel error handling net/mlx5e: Fix LSO vlan insertion net/mlx5e: Re-eanble client vlan TX acceleration net/mlx5e: Return error in case mlx5e_set_features() fails net/mlx5e: Don't allow more than max supported channels ...	2015-11-04 09:41:05 -08:00
Paolo Bonzini	197a4f4b06	KVM/ARM Changes for v4.4-rc1 Includes a number of fixes for the arch-timer, introducing proper level-triggered semantics for the arch-timers, a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups getting rid of redundant state, and finally a stylistic change that gets rid of some ctags warnings. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWOhgAAAoJEEtpOizt6ddyKS8H/2ZHMTPjo6yChnrusNWy4Qbr 6laPDlzL+g45oMQRwNL7GnM1deRftaxvT2Wi+X84D/6Y/BD6MPds4HgtBfuWcSZ1 CyRJ0Ot/zrxenucSuJuOjq+a9gdizdAczkbB1MfYDULJH8fb6D+7RYLo3zgh4Xo4 pla3L9U6gSWe+YopBjZtZH43m3fwiwSM/v+uHOTIcXrsbR+fEgx/EFSKmA/DUCuo P5cFO/ceUGu7nATCexu5V82TgR2hvurrsR7mqfwY8YcF6HRM+NEOoS29xWC77v5S u/F08TKuKQLv0YTEFTyLETI/oEeuC0cHtrRQBNf4+9kXEOzKyXaae0wR/I6X2Ss= =GMNk -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Changes for v4.4-rc1 Includes a number of fixes for the arch-timer, introducing proper level-triggered semantics for the arch-timers, a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding), some tracepoint improvements, a tweak for the EL2 panic handlers, some more VGIC cleanups getting rid of redundant state, and finally a stylistic change that gets rid of some ctags warnings. Conflicts: arch/x86/include/asm/kvm_host.h	2015-11-04 16:24:17 +01:00
Linus Torvalds	b02ac6b18c	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - Improve accuracy of perf/sched clock on x86. (Adrian Hunter) - Intel DS and BTS updates. (Alexander Shishkin) - Intel cstate PMU support. (Kan Liang) - Add group read support to perf_event_read(). (Peter Zijlstra) - Branch call hardware sampling support, implemented on x86 and PowerPC. (Stephane Eranian) - Event groups transactional interface enhancements. (Sukadev Bhattiprolu) - Enable proper x86/intel/uncore PMU support on multi-segment PCI systems. (Taku Izumi) - ... misc fixes and cleanups. The perf tooling team was very busy again with 200+ commits, the full diff doesn't fit into lkml size limits. Here's an (incomplete) list of the tooling highlights: New features: - Change the default event used in all tools (record/top): use the most precise "cycles" hw counter available, i.e. when the user doesn't specify any event, it will try using cycles:ppp, cycles:pp, etc and fall back transparently until it finds a working counter. (Arnaldo Carvalho de Melo) - Integration of perf with eBPF that, given an eBPF .c source file (or .o file built for the 'bpf' target with clang), will get it automatically built, validated and loaded into the kernel via the sys_bpf syscall, which can then be used and seen using 'perf trace' and other tools. (Wang Nan) Various user interface improvements: - Automatic pager invocation on long help output. (Namhyung Kim) - Search for more options when passing args to -h, e.g.: (Arnaldo Carvalho de Melo) $ perf report -h interface Usage: perf report [<options>] --gtk Use the GTK2 interface --stdio Use the stdio interface --tui Use the TUI interface - Show ordered command line options when -h is used or when an unknown option is specified. (Arnaldo Carvalho de Melo) - If options are passed after -h, show just its descriptions, not all options. (Arnaldo Carvalho de Melo) - Implement column based horizontal scrolling in the hists browser (top, report), making it possible to use the TUI for things like 'perf mem report' where there are many more columns than can fit in a terminal. (Arnaldo Carvalho de Melo) - Enhance the error reporting of tracepoint event parsing, e.g.: $ oldperf record -e sched:sched_switc usleep 1 event syntax error: 'sched:sched_switc' \___ unknown tracepoint Run 'perf list' for a list of valid events Now we get the much nicer: $ perf record -e sched:sched_switc ls event syntax error: 'sched:sched_switc' \___ can't access trace events Error: No permissions to read /sys/kernel/debug/tracing/events/sched/sched_switc Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug' And after we have those mount point permissions fixed: $ perf record -e sched:sched_switc ls event syntax error: 'sched:sched_switc' \___ unknown tracepoint Error: File /sys/kernel/debug/tracing/events/sched/sched_switc not found. Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?. I.e. basically now the event parsing routing uses the strerror_open() routines introduced by and used in 'perf trace' work. (Jiri Olsa) - Fail properly when pattern matching fails to find a tracepoint, i.e. '-e non:existent' was being correctly handled, with a proper error message about that not being a valid event, but '-e non:existent' wasn't, fix it. (Jiri Olsa) - Do event name substring search as last resort in 'perf list'. (Arnaldo Carvalho de Melo) E.g.: # perf list clock List of pre-defined events (to be used in -e): cpu-clock [Software event] task-clock [Software event] uncore_cbox_0/clockticks/ [Kernel PMU event] uncore_cbox_1/clockticks/ [Kernel PMU event] kvm:kvm_pvclock_update [Tracepoint event] kvm:kvm_update_master_clock [Tracepoint event] power:clock_disable [Tracepoint event] power:clock_enable [Tracepoint event] power:clock_set_rate [Tracepoint event] syscalls:sys_enter_clock_adjtime [Tracepoint event] syscalls:sys_enter_clock_getres [Tracepoint event] syscalls:sys_enter_clock_gettime [Tracepoint event] syscalls:sys_enter_clock_nanosleep [Tracepoint event] syscalls:sys_enter_clock_settime [Tracepoint event] syscalls:sys_exit_clock_adjtime [Tracepoint event] syscalls:sys_exit_clock_getres [Tracepoint event] syscalls:sys_exit_clock_gettime [Tracepoint event] syscalls:sys_exit_clock_nanosleep [Tracepoint event] syscalls:sys_exit_clock_settime [Tracepoint event] Intel PT hardware tracing enhancements: - Accept a zero --itrace period, meaning "as often as possible". In the case of Intel PT that is the same as a period of 1 and a unit of 'instructions' (i.e. --itrace=i1i). (Adrian Hunter) - Harmonize itrace's synthesized callchains with the existing --max-stack tool option. (Adrian Hunter) - Allow time to be displayed in nanoseconds in 'perf script'. (Adrian Hunter) - Fix potential infinite loop when handling Intel PT timestamps. (Adrian Hunter) - Slighly improve Intel PT debug logging. (Adrian Hunter) - Warn when AUX data has been lost, just like when processing PERF_RECORD_LOST. (Adrian Hunter) - Further document export-to-postgresql.py script. (Adrian Hunter) - Add option to synthesize branch stack from auxtrace data. (Adrian Hunter) Misc notable changes: - Switch the default callchain output mode to 'graph,0.5,caller', to make it look like the default for other tools, reducing the learning curve for people used to 'caller' based viewing. (Arnaldo Carvalho de Melo) - various call chain usability enhancements. (Namhyung Kim) - Introduce the 'P' event modifier, meaning 'max precision level, please', i.e.: $ perf record -e cycles:P usleep 1 Is now similar to: $ perf record usleep 1 Useful, for instance, when specifying multiple events. (Jiri Olsa) - Add 'socket' sort entry, to sort by the processor socket in 'perf top' and 'perf report'. (Kan Liang) - Introduce --socket-filter to 'perf report', for filtering by processor socket. (Kan Liang) - Add new "Zoom into Processor Socket" operation in the perf hists browser, used in 'perf top' and 'perf report'. (Kan Liang) - Allow probing on kmodules without DWARF. (Masami Hiramatsu) - Fix 'perf probe -l' for probes added to kernel module functions. (Masami Hiramatsu) - Preparatory work for the 'perf stat record' feature that will allow generating perf.data files with counting data in addition to the sampling mode we have now (Jiri Olsa) - Update libtraceevent KVM plugin. (Paolo Bonzini) - ... plus lots of other enhancements that I failed to list properly, by: Adrian Hunter, Alexander Shishkin, Andi Kleen, Andrzej Hajda, Arnaldo Carvalho de Melo, Dima Kogan, Don Zickus, Geliang Tang, He Kuang, Huaitong Han, Ingo Molnar, Jan Stancek, Jiri Olsa, Kan Liang, Kirill Tkhai, Masami Hiramatsu, Matt Fleming, Namhyung Kim, Paolo Bonzini, Peter Zijlstra, Rabin Vincent, Scott Wood, Stephane Eranian, Sukadev Bhattiprolu, Taku Izumi, Vaishali Thakkar, Wang Nan, Yang Shi and Yunlong Song" 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (260 commits) perf unwind: Pass symbol source to libunwind tools build: Fix libiberty feature detection perf tools: Compile scriptlets to BPF objects when passing '.c' to --event perf record: Add clang options for compiling BPF scripts perf bpf: Attach eBPF filter to perf event perf tools: Make sure fixdep is built before libbpf perf script: Enable printing of branch stack perf trace: Add cmd string table to decode sys_bpf first arg perf bpf: Collect perf_evsel in BPF object files perf tools: Load eBPF object into kernel perf tools: Create probe points for BPF programs perf tools: Enable passing bpf object file to --event perf ebpf: Add the libbpf glue perf tools: Make perf depend on libbpf perf symbols: Fix endless loop in dso__split_kallsyms_for_kcore perf tools: Enable pre-event inherit setting by config terms perf symbols: we can now read separate debug-info files based on a build ID perf symbols: Fix type error when reading a build-id perf tools: Search for more options when passing args to -h perf stat: Cache aggregated map entries in extra cpumap ...	2015-11-03 17:38:09 -08:00
Linus Torvalds	6aa2fdb87c	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The irq departement delivers: - Rework the irqdomain core infrastructure to accomodate ACPI based systems. This is required to support ARM64 without creating artificial device tree nodes. - Sanitize the ACPI based ARM GIC initialization by making use of the new firmware independent irqdomain core - Further improvements to the generic MSI management - Generalize the irq migration on CPU hotplug - Improvements to the threaded interrupt infrastructure - Allow the migration of "chained" low level interrupt handlers - Allow optional force masking of interrupts in disable_irq[_nosysnc] - Support for two new interrupt chips - Sigh! - A larger set of errata fixes for ARM gicv3 - The usual pile of fixes, updates, improvements and cleanups all over the place" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) Document that IRQ_NONE should be returned when IRQ not actually handled PCI/MSI: Allow the MSI domain to be device-specific PCI: Add per-device MSI domain hook of/irq: Use the msi-map property to provide device-specific MSI domain of/irq: Split of_msi_map_rid to reuse msi-map lookup irqchip/gic-v3-its: Parse new version of msi-parent property PCI/MSI: Use of_msi_get_domain instead of open-coded "msi-parent" parsing of/irq: Use of_msi_get_domain instead of open-coded "msi-parent" parsing of/irq: Add support code for multi-parent version of "msi-parent" irqchip/gic-v3-its: Add handling of PCI requester id. PCI/MSI: Add helper function pci_msi_domain_get_msi_rid(). of/irq: Add new function of_msi_map_rid() Docs: dt: Add PCI MSI map bindings irqchip/gic-v2m: Add support for multiple MSI frames irqchip/gic-v3: Fix translation of LPIs after conversion to irq_fwspec irqchip/mxs: Add Alphascale ASM9260 support irqchip/mxs: Prepare driver for hardware with different offsets irqchip/mxs: Panic if ioremap or domain creation fails irqdomain: Documentation updates irqdomain/msi: Use fwnode instead of of_node ...	2015-11-03 14:40:01 -08:00
Michael Ellerman	3b0e21ec3b	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include 64-bit book3e kexec/kdump support, a rework of the qoriq clock driver, device tree changes including qoriq fman nodes, support for a new 85xx board, and some fixes. Note that there is a trivial merge conflict with the clock tree's next branch, in the clock Makefile."	2015-11-02 13:59:48 +11:00
David S. Miller	b75ec3af27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-11-01 00:15:30 -04:00
Linus Torvalds	8a28d67457	powerpc fixes for 4.3 #5 - powerpc/dma: dma_set_coherent_mask() should not be GPL only from Ben -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWMFthAAoJEFHr6jzI4aWAZFkP/1Ci8cgh0GIfdN3wafi85pIe dhSBicWpHPpwnvcaD+ab/Q73/nfub6NKTpMasBankf53SYUWmPptwu+Gs0lv7GVv Q1AjOEP0vhcBVxhXNJpOWDGFyt5wez4BevqWpmPuz9MEy7PCpgevQJjPEsG8qnzh BfUvDkWAOjA6jX3mTyZQyefZECGPhDsM31eGYvkFQRB6Ui1KKbYbFMncjjTtBfjW fiUnKShJZIcVtpImGg6WQRKjRaT7uJzmFOLUclv/7dDoT8hLztiiLP44zX+kvJtF AyCBSQI/LOrQE826YpF93Vq6WZVvzNQZaPIHPC2pj1WIF6Q/gTTng5T16jODX+QI ldQ96sLr9WA49UaS6JakA7iv/OhRXUlRO6qyow8I+1XQdoY5yuVyXct9NpQ3A9J6 lOw7d8o4RuMaGqMx5nUcyDrv/NICYuuyaikDe+8GY4Df6FG6Aepit1WO6BcDkrfC FrnQ+XROmvNoKbVRYYhJZRhEqz1PJJwOmCtNYNzjiJXfgKNknYPBfoBXqx8XeDxl VTyxer8V/wMP50X+SIg558YqNIXMj9ZA7fU5xnN4Y7MbHC5DEBnVX9nmc0J5PjTF t36S+cm1BeG44/LTSrH66DwsnWce0xo0doK7wBx+CKSGXVciu1VPn59upL1UtX5z WnyEnMm5NYc5aTGfZf+L =sFFh -----END PGP SIGNATURE----- Merge tag 'powerpc-4.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - powerpc/dma: dma_set_coherent_mask() should not be GPL only from Ben * tag 'powerpc-4.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/dma: dma_set_coherent_mask() should not be GPL only	2015-10-28 18:59:53 +09:00
Benjamin Herrenschmidt	977bf062bb	powerpc/dma: dma_set_coherent_mask() should not be GPL only When turning this from inline to an exported function I was a bit over-eager and made it GPL only. This prevents the use of pretty much all non-GPL PCI driver which is a bit over the top. Let's bring it back in line with other architecture. Fixes: `817820b022` ("powerpc/iommu: Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-28 14:20:50 +09:00
Denis Kirjanov	ccde64b51b	powerpc/msi: Fix section mismatch warning in msi_bitmap_alloc() Building with CONFIG_DEBUG_SECTION_MISMATCH gives the following warning: The function .msi_bitmap_alloc() references the function __init .memblock_virt_alloc_try_nid(). Memory allocation in msi_bitmap_alloc() uses either slab allocator or memblock boot time allocator depending on slab_is_available(). So the section mismatch warning is correct, but in practice there is no bug so mark msi_bitmap_alloc() as __init_refok. Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> [mpe: Flesh out change log a bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-28 12:08:33 +09:00
Michael Ellerman	16c1d60626	powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id() Use of_get_next_parent() to simplifiy the logic in of_get_ibm_chip_id(). Original-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-28 12:08:32 +09:00
Nathan Fontenot	f755ecfb8c	powerpc/pseries: Correct string length in pseries_of_derive_parent() Commit `a030e1e4bb` make a change to use kstrndup() instead of kmalloc() + strlcpy() in the pseries_of_derive_parent() routine that introduces a subtle change in the parent path name generated. The kstrndup() routine will copy n characters followed by a terminating null, whereas strlcpy() will copy n-1 characters and add a terminating null. This slight difference results in having a parent path that includes the tailing '/' character, "/cpus/" vs. "/cpus". This then causes the subsequent call to of_find_node_by_path() to fail, and in the case of DLPAR add operations the DLPAR request fails. This patch decrements the pointer returned from kbasename() to point to the '/' character before the base name instead of the base name. This then adjusts the string length calculations to not include the trailing '/' in the parent path name. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-28 12:08:18 +09:00
Kevin Hao	e1f580e8ce	powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry In order to workaround Erratum A-008139, we have to invalidate the tlb entry with tlbilx before overwriting. Due to the performance consideration, we don't add any memory barrier when acquire/release the tcd lock. This means the two load instructions for esel_next do have the possibility to return different value. This is definitely not acceptable due to the Erratum A-008139. We have two options to fix this issue: a) Add memory barrier when acquire/release tcd lock to order the load/store to esel_next. b) Just make sure to invalidate and write to the same tlb entry and tolerate the race that we may get the wrong value and overwrite the tlb entry just updated by the other thread. We observe better performance using option b. So reserve an additional register to save the value of the esel_next. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:14:40 -05:00
Igal Liberman	da414bb923	powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s) Based on prior work by Andy Fleming <afleming@freescale.com> Signed-off-by: Shruti Kanetkar <Shruti@freescale.com> Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Igal Liberman <Igal.Liberman@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:14:39 -05:00
Igal Liberman	d55ad2967d	powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan Based on prior work by Andy Fleming <afleming@freescale.com> Signed-off-by: Shruti Kanetkar <Shruti@freescale.com> Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Igal Liberman <Igal.Liberman@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:14:39 -05:00
Scott Wood	7a1db41d83	powerpc/fsl: Add #clock-cells and clockgen label to clockgen nodes This allows new-style clock references to be used, which is needed for fman. The old clock nodes will be removed and all clock references converted to new-style once the qoriq-cpufreq driver is updated to stop depending on the old-style references in cpu nodes. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:14:38 -05:00
Scott Wood	43f2cfcce2	Merge branch 'clock' into HEAD This is a major overhaul of the clk-qoriq driver, which I'm merging via PPC with Stephen Boyd's ack in order to apply subsequent PPC patches that depend on it.	2015-10-27 18:14:16 -05:00
Christophe Leroy	9d28cc811b	powerpc: handle error case in cpm_muram_alloc() rh_alloc() returns (unsigned long)-ERRxx on error, which may result in overwriting memory outside the MURAM AREA. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:31 -05:00
Sudeep Holla	9100d20c5b	powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake mpic_irq_set_wake return -ENXIO for non FSL MPIC and sets IRQF_NO_SUSPEND flag for FSL ones. enable_irq_wake already returns -ENXIO if irq_set_wak is not implemented. Also there's no need to set the IRQF_NO_SUSPEND flag as it doesn't guarantee wakeup for that interrupt. This patch removes the redundant mpic_irq_set_wake and sets the IRQCHIP_SKIP_SET_WAKE for only FSL MPIC. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Scott Wood <scottwood@freescale.com> Cc: Hongtao Jia <hongtao.jia@freescale.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:31 -05:00
Tiejun Chen	96eea6426f	powerpc/book3e-64: Enable kexec Allow KEXEC for book3e, and bypass or convert non-book3e stuff in kexec code. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [scottwood@freescale.com: move code to minimize diff, and cleanup] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:30 -05:00
Scott Wood	ae73e4ccbc	powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop book3e_secondary_core_init will only create a TLB entry if r4 = 0, so do so. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:30 -05:00
Scott Wood	ffda09a994	powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32 The way VIRT_PHYS_OFFSET is not correct on book3e-64, because it does not account for CONFIG_RELOCATABLE other than via the 32-bit-only virt_phys_offset. book3e-64 can (and if the comment about a GCC miscompilation is still relevant, should) use the normal ppc64 __va/__pa. At this point, only booke-32 will use VIRT_PHYS_OFFSET, so given the issues with its calculation, restrict its definition to booke-32. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:29 -05:00
Scott Wood	567cf94dc7	powerpc/book3e-64/kexec: Enable SMP release The SMP release mechanism for FSL book3e is different from when booting with normal hardware. In theory we could simulate the normal spin table mechanism, but not at the addresses U-Boot put in the device tree -- so there'd need to be even more communication between the kernel and kexec to set that up. Instead, kexec-tools will set a boolean property linux,booted-from-kexec in the /chosen node. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: devicetree@vger.kernel.org	2015-10-27 18:13:29 -05:00
Tiejun Chen	cf904e3088	powerpc/book3e-64/kexec: create an identity TLB mapping book3e has no real MMU mode so we have to create an identity TLB mapping to make sure we can access the real physical address. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [scottwood: cleanup, and split off some changes] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:28 -05:00
Scott Wood	ecc4999f68	powerpc/book3e-64: Don't limit paca to 256 MiB This limit only makes sense on book3s, and on book3e it can cause problems with kdump if we don't have any memory under 256 MiB. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:28 -05:00
Scott Wood	eeaab663a0	powerpc/book3e/kdump: Enable crash_kexec_wait_realmode While book3e doesn't have "real mode", we still want to wait for all the non-crash cpus to complete their shutdown. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:27 -05:00
Tiejun Chen	1cb6e06492	powerpc/book3e: support CONFIG_RELOCATABLE book3e is different with book3s since 3s includes the exception vectors code in head_64.S as it relies on absolute addressing which is only possible within this compilation unit. So we have to get that label address with got. And when boot a relocated kernel, we should reset ipvr properly again after .relocate. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [scottwood: cleanup and ifdef removal] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:27 -05:00
Tiejun Chen	835c031c98	powerpc/booke64: Fix args to copy_and_flush Convert r4/r5, not r6, to a virtual address when calling copy_and_flush. Otherwise, r3 is already virtual, and copy_to_flush tries to access r3+r6, PAGE_OFFSET gets added twice. This isn't normally seen because on book3e we normally enter with the kernel at zero and thus skip copy_to_flush -- but it will be needed for kexec support. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [scottwood: split patch and rewrote changelog] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:26 -05:00
Tiejun Chen	68d1014019	powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts Rename 'interrupt_end_book3e' to '__end_interrupts' so that the symbol can be used by both book3s and book3e. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [scottwood: edit changelog] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:26 -05:00
Scott Wood	f34b3e19fd	powerpc/e6500: kexec: Handle hardware threads The new kernel will be expecting secondary threads to be disabled, not spinning. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:25 -05:00
Tiejun Chen	939fbf0080	powerpc/85xx: Implement 64-bit kexec support Unlike 32-bit 85xx kexec, we don't do a core reset. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> [scottwood: edit changelog, and cleanup] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:25 -05:00
Scott Wood	eba5de8dc1	powerpc/fsl-booke-64: Don't limit ppc64_rma_size to one TLB entry This is required for kdump to work when loaded at at an address that does not fall within the first TLB entry -- which can easily happen because while the lower limit is enforced via reserved memory, which doesn't affect how much is mapped, the upper limit is enforced via a different mechanism that does. Thus, more TLB entries are needed than would normally be used, as the total memory to be mapped might not be a power of two. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-27 18:13:22 -05:00
Linus Torvalds	a2c01ed5d4	powerpc fixes for 4.3 #4 - Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8" from Paul - Handle irq_happened flag correctly in off-line loop from Paul - Validate rtas.entry before calling enter_rtas() from Vasant -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWKakXAAoJEFHr6jzI4aWAqwsP/22a8aq7SPuPaCK/c7u84aHg UkzYNdrC+osDvnrydqyJsmIojVLX+AsJ7TCbBZFaJ6oOuew9bVzmZ5mfNvOVn86H dCH2GMr4NbWbkO3SaFi6ZTUVGl7JyjEhf3uCtKGssa2+Do8FubK6Y88L1rhzFFdz l1Dx3Jp8CpGKcByQfwYyaNZhC/GEZ06pY36d362mLnyctxcQRYr5l+8boDH81nyE f89RE7baNPYOL0YOhZAh3ZilBrZ8DIAaesMXU8LUKFbLTBgWfVPkDy3l5a2m47oP V/Yi+oEQBkL/3Itth57iGWpl8vVkzF2MTu8Aep3BzHJXqXCliTzVVdXETW6NCdut Nss0xtnNdM18+0mhG3LzzdoZGi/Zb0SYz8j+nY5vE2nf7FDVFkAZzKHeW822zNaV A1PLJa+ei4jVhKtTp4wETjpUi+kw+ikM+rR1L1/+IKHbriLsRrj7Zw3xo6Em1KVq cI2g7DZLSzptIprxbEv9rNhb1VlBot4jc4mmGhmyMlwKDkpCxRkYVv+Ilfi6jCSc 6llKTZfKqEV+0sXO6QISv8wfiye84jVTKOlkpQLvpugz9rBTpq8apmInVh4AHF2b wDRgs/iyOSZuz+UiPEHHXbW7ZfF2F7lqxxtQgJefiWLDsvBbsfnVTyDJsKibvWzb lxorlKx/tH/q4pNBjmoB =K7eA -----END PGP SIGNATURE----- Merge tag 'powerpc-4.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8" from Paul - Handle irq_happened flag correctly in off-line loop from Paul - Validate rtas.entry before calling enter_rtas() from Vasant * tag 'powerpc-4.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/rtas: Validate rtas.entry before calling enter_rtas() powerpc/powernv: Handle irq_happened flag correctly in off-line loop powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8"	2015-10-23 18:49:51 +09:00
Scott Wood	d9e1831a42	powerpc/85xx: Load all early TLB entries at once Use an AS=1 trampoline TLB entry to allow all normal TLB1 entries to be loaded at once. This avoids the need to keep the translation that code is executing from in the same TLB entry in the final TLB configuration as during early boot, which in turn is helpful for relocatable kernels (e.g. kdump) where the kernel is not running from what would be the first TLB entry. On e6500, we limit map_mem_in_cams() to the primary hwthread of a core (the boot cpu is always considered primary, as a kdump kernel can be entered on any cpu). Each TLB only needs to be set up once, and when we do, we don't want another thread to be running when we create a temporary trampoline TLB1 entry. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-22 22:50:46 -05:00
Christoffer Dall	3217f7c25b	KVM: Add kvm_arch_vcpu_{un}blocking callbacks Some times it is useful for architecture implementations of KVM to know when the VCPU thread is about to block or when it comes back from blocking (arm/arm64 needs to know this to properly implement timers, for example). Therefore provide a generic architecture callback function in line with what we do elsewhere for KVM generic-arch interactions. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:41 +02:00
Himangi Saraogi	39e69f55f8	powerpc: Introduce the use of the managed version of kzalloc This patch moves data allocated using kzalloc to managed data allocated using devm_kzalloc and cleans now unnecessary kfree in probe function. The following Coccinelle semantic patch was used for making the change: @platform@ identifier p, probefn, removefn; @@ struct platform_driver p = { .probe = probefn, .remove = removefn, }; @prb@ identifier platform.probefn, pdev; expression e, e1, e2; @@ probefn(struct platform_device *pdev, ...) { <+... - e = kzalloc(e1, e2) + e = devm_kzalloc(&pdev->dev, e1, e2) ... ?-kfree(e); ...+> } Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Anatolij Gustschin <agust@denx.de>	2015-10-22 16:13:23 +02:00
Uwe Kleine-König	ffcea122c4	powerpc: mpc512x: drop bogus and unused psc register bit definitions These were introduced in commit `25ae3a0739` ("[POWERPC] mpc512x: Add MPC512x PSC support to MPC52xx psc driver") and never used. Moreover according to the datasheet[1] MEMERROR is bit 25 (0x40) and ORERR is bit 27 (0x10). [1] MPC5125RM Rev. 2; 11/2009 Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Anatolij Gustschin <agust@denx.de>	2015-10-22 16:06:08 +02:00
Alexander Popov	de03fe287b	powerpc/512x: add a device tree binding for LocalPlus Bus FIFO Add a device tree binding for Freescale MPC512x LocalPlus Bus FIFO and introduce the document describing that binding. Signed-off-by: Alexander Popov <alex.popov@linux.com> Signed-off-by: Anatolij Gustschin <agust@denx.de>	2015-10-22 15:20:47 +02:00
Alexander Popov	1a4bb93f79	powerpc/512x: add LocalPlus Bus FIFO device driver This driver for Freescale MPC512x LocalPlus Bus FIFO (called SCLPC in the Reference Manual) allows Direct Memory Access transfers between RAM and peripheral devices on LocalPlus Bus. Signed-off-by: Alexander Popov <alex.popov@linux.com> Signed-off-by: Anatolij Gustschin <agust@denx.de>	2015-10-22 15:19:40 +02:00
Luis de Bethencourt	7e610890b5	powerpc: platforms: mpc52xx_lpbfifo: Fix module autoload for OF platform driver This platform driver has a OF device ID table but the OF module alias information is not created so module autoloading won't work. Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Signed-off-by: Anatolij Gustschin <agust@denx.de>	2015-10-22 14:36:53 +02:00
Vasant Hegde	8832317f66	powerpc/rtas: Validate rtas.entry before calling enter_rtas() Currently we do not validate rtas.entry before calling enter_rtas(). This leads to a kernel oops when user space calls rtas system call on a powernv platform (see below). This patch adds code to validate rtas.entry before making enter_rtas() call. Oops: Exception in kernel mode, sig: 4 [#1] SMP NR_CPUS=1024 NUMA PowerNV task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000 NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140 REGS: c0000007e1a7b920 TRAP: 0e40 Not tainted (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le) MSR: 1000000000081000 <HV,ME> CR: 00000000 XER: 00000000 CFAR: c000000000009c0c SOFTE: 0 NIP [0000000000000000] (null) LR [0000000000009c14] 0x9c14 Call Trace: [c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable) [c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0 [c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98 Cc: stable@vger.kernel.org # v3.2+ Fixes: `55190f8878` ("powerpc: Add skeleton PowerNV platform") Reported-by: NAGESWARA R. SASTRY <nasastry@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [mpe: Reword change log, trim oops, and add stable + fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-22 11:03:25 +11:00
Scott Wood	9484865447	powerpc/fsl: Move fsl_guts.h out of arch/powerpc Freescale's Layerscape ARM chips use the same structure. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-21 18:05:50 -05:00
Paul Mackerras	53c656c413	powerpc/powernv: Handle irq_happened flag correctly in off-line loop This fixes a bug where it is possible for an off-line CPU to fail to go into a low-power state (nap/sleep/winkle), and to become unresponsive to requests from the KVM subsystem to wake up and run a VCPU. What can happen is that a maskable interrupt of some kind (external, decrementer, hypervisor doorbell, or HMI) after we have called local_irq_disable() at the beginning of pnv_smp_cpu_kill_self() and before interrupts are hard-disabled inside power7_nap/sleep/winkle(). In this situation, the pending event is marked in the irq_happened flag in the PACA. This pending event prevents power7_nap/sleep/winkle from going to the requested low-power state; instead they return immediately. We don't deal with any of these pending event flags in the off-line loop in pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case, so we will have srr1 == 0, and none of the processing to clear interrupts or doorbells will be done. Usually, the most obvious symptom of this is that a KVM guest will fail with a console message saying "KVM: couldn't grab cpu N". This fixes the problem by making sure we handle the irq_happened flags properly. First, we hard-disable before the off-line loop. Once we have hard-disabled, the irq_happened flags can't change underneath us. We unconditionally clear the DEC and HMI flags: there is no processing of timer interrupts while off-line, and the necessary HMI processing is all done in lower-level code. We leave the EE and DBELL flags alone for the first iteration of the loop, so that we won't fail to respond to a split-core request that came in just before hard-disabling. Within the loop, we handle external interrupts if the EE bit is set in irq_happened as well as if the low-power state was interrupted by an external interrupt. (We don't need to do the msgclr for a pending doorbell in irq_happened, because doorbells are edge-triggered and don't remain pending in hardware.) Then we clear both the EE and DBELL flags, and once clear, they cannot be set again (until this CPU comes online again, that is). This also fixes the debug check to not be done when we just ran a KVM guest or when the sleep didn't happen because of a pending event in irq_happened. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:52:49 +11:00
Paul Mackerras	23316316c1	powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8" This reverts commit `9678cdaae9` ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8") because the original commit had multiple, partly self-cancelling bugs, that could cause occasional memory corruption. In fact the logmpp instruction was incorrectly using register r0 as the source of the buffer address and operation code, and depending on what was in r0, it would either do nothing or corrupt the 64k page pointed to by r0. The logmpp instruction encoding and the operation code definitions could be corrected, but then there is the problem that there is no clearly defined way to know when the hardware has finished writing to the buffer. The original commit attempted to work around this by aborting the write-out before starting the prefetch, but this is ineffective in the case where the virtual core is now executing on a different physical core from the one where the write-out was initiated. These problems plus advice from the hardware designers not to use the function (since the measured performance improvement from using the feature was actually mostly negative), mean that reverting the code is the best option. Fixes: `9678cdaae9` ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8") Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:50:30 +11:00
Gavin Shan	353169acf1	powerpc/eeh: Fix recursive fenced PHB on Broadcom shiner adapter Similar to commit `b6541db` ("powerpc/eeh: Block PCI config access upon frozen PE"), this blocks the PCI config space of Broadcom Shiner adapter until PE reset is completed, to avoid recursive fenced PHB when dumping PCI config registers during the period of error recovery. ~# lspci -ns 0003:03:00.0 0003:03:00.0 0200: 14e4:168a (rev 10) ~# lspci -s 0003:03:00.0 0003:03:00.0 Ethernet controller: Broadcom Corporation \ NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10) Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:42:16 +11:00
Gavin Shan	f9433718d6	powerpc/powernv: Simplify pnv_eeh_set_option() This simplifies pnv_eeh_set_option() to avoid unnecessary nested if statements, to improve readability. No functional changes. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:42:15 +11:00
Gavin Shan	4d6186ca6f	powerpc/powernv: Remove pnv_eeh_cap_start() This moves the logic of pnv_eeh_cap_start() to pnv_eeh_find_cap() as the function is only called by pnv_eeh_find_cap(). The logic of both functions are pretty simple. No need to have separate functions. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:42:15 +11:00
Gavin Shan	608fb9c296	powerpc/powernv: Cleanup on EEH comments This applies cleanup on eeh-powernv.c, no functional changes: * Remove unnecessary comments and empty line. * Correct inaccurate comments. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:42:15 +11:00
Gavin Shan	00ba05a12b	powerpc/pseries: Cleanup on pseries_eeh_get_state() This cleans up pseries_eeh_get_state(), no functional changes: * Return EEH_STATE_NOT_SUPPORT early when the 2nd RTAS output argument is zero to avoid nested if statements. * Skip clearing bits in the PE state represented by variable "result" to simplify the code. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:42:14 +11:00
Gavin Shan	872ee2d652	powerpc/eeh: More relaxed condition for enabled IO path When one or both of the below two flags are marked in the PE state, the PE's IO path is regarded as enabled: EEH_STATE_MMIO_ACTIVE or EEH_STATE_MMIO_ENABLED. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:41:43 +11:00
Gavin Shan	8234fcedf1	powerpc/eeh: Force reset on fenced PHB On fenced PHB, the error handlers in the drivers of its subordinate devices could return PCI_ERS_RESULT_CAN_RECOVER, indicating no reset will be issued during the recovery. It's conflicting with the fact that fenced PHB won't be recovered without reset. This limits the return value from the error handlers in the drivers of the fenced PHB's subordinate devices to PCI_ERS_RESULT_NEED_NONE or PCI_ERS_RESULT_NEED_RESET, to ensure reset will be issued during recovery. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:41:43 +11:00
Gavin Shan	f2da4ccf8b	powerpc/eeh: More relaxed hotplug criterion Currently, we rely on the existence of struct pci_driver::err_handler to decide if the corresponding PCI device should be unplugged during EEH recovery (partially hotplug case). However that check is not sufficient. Some device drivers implement only some of the EEH error handlers to collect diag-data. That means the driver still expects a hotplug to recover from the EEH error. This makes the hotplug criterion more relaxed: if the device driver doesn't provide all necessary EEH error handlers, it will experience hotplug during EEH recovery. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> [mpe: Minor change log rewording] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:39:07 +11:00
Gavin Shan	527d10ef3a	powerpc/eeh: Don't unfreeze PHB PE after reset On PowerNV platform, the PE is kept in frozen state until the PE reset is completed to avoid recursive EEH error caused by MMIO access during the period of EEH reset. The PE's frozen state is cleared after BARs of PCI device included in the PE are restored and enabled. However, we needn't clear the frozen state for PHB PE explicitly at this point as there is no real PE for PHB PE. As the PHB PE is always binding with PE#0, we actually clear PE#0, which is wrong. It doesn't incur any problem though. This checks if the PE is PHB PE and doesn't clear the frozen state if it is. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:06:57 +11:00
Geoff Levand	879c26d4f6	powerpc/ps3: Quieten boot wrapper output with run_cmd Add a boot wrapper script function run_cmd which will run a shell command quietly and only print the output if either V=1 or an error occurs. Also, run the ps3 dd commands with run_cmd to clean up the build output. Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 20:06:56 +11:00
Gautham R. Shenoy	70aa3961a1	KVM: PPC: Book3S HV: Handle H_DOORBELL on the guest exit path Currently a CPU running a guest can receive a H_DOORBELL in the following two cases: 1) When the CPU is napping due to CEDE or there not being a guest vcpu. 2) The CPU is running the guest vcpu. Case 1), the doorbell message is not cleared since we were waking up from nap. Hence when the EE bit gets set on transition from guest to host, the H_DOORBELL interrupt is delivered to the host and the corresponding handler is invoked. However in Case 2), the message gets cleared by the action of taking the H_DOORBELL interrupt. Since the CPU was running a guest, instead of invoking the doorbell handler, the code invokes the second-level interrupt handler to switch the context from the guest to the host. At this point the setting of the EE bit doesn't result in the CPU getting the doorbell interrupt since it has already been delivered once. So, the handler for this doorbell is never invoked! This causes softlockups if the missed DOORBELL was an IPI sent from a sibling subcore on the same CPU. This patch fixes it by explitly invoking the doorbell handler on the exit path if the exit reason is H_DOORBELL similar to the way an EXTERNAL interrupt is handled. Since this will also handle Case 1), we can unconditionally clear the doorbell message in kvmppc_check_wake_reason. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-21 16:31:52 +11:00
Nikunj A Dadhania	bfec5c2cc0	KVM: PPC: Implement extension to report number of memslots QEMU assumes 32 memslots if this extension is not implemented. Although, current value of KVM_USER_MEM_SLOTS is 32, once KVM_USER_MEM_SLOTS changes QEMU would take a wrong value. Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-21 16:31:46 +11:00
Paul Mackerras	c64dfe2af3	KVM: PPC: Book3S HV: Make H_REMOVE return correct HPTE value for absent HPTEs This fixes a bug where the old HPTE value returned by H_REMOVE has the valid bit clear if the HPTE was an absent HPTE, as happens for HPTEs for emulated MMIO pages and for RAM pages that have been paged out by the host. If the absent bit is set, we clear it and set the valid bit, because from the guest's point of view, the HPTE is valid. Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-21 16:25:06 +11:00
Paul Mackerras	572abd563b	KVM: PPC: Book3S HV: Don't fall back to smaller HPT size in allocation ioctl Currently the KVM_PPC_ALLOCATE_HTAB will try to allocate the requested size of HPT, and if that is not possible, then try to allocate smaller sizes (by factors of 2) until either a minimum is reached or the allocation succeeds. This is not ideal for userspace, particularly in migration scenarios, where the destination VM really does require the size requested. Also, the minimum HPT size of 256kB may be insufficient for the guest to run successfully. This removes the fallback to smaller sizes on allocation failure for the KVM_PPC_ALLOCATE_HTAB ioctl. The fallback still exists for the case where the HPT is allocated at the time the first VCPU is run, if no HPT has been allocated by ioctl by that time. Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-21 16:24:57 +11:00
Christophe Jaillet	1856f50c66	powerpc/prom: Avoid reference to potentially freed memory of_get_property() is used inside the loop, but then the reference to the node is dropped before dereferencing the prop pointer, which could by then point to junk if the node has been freed. Instead use of_property_read_u32() to actually read the property value before dropping the reference. of_property_read_u32() requires at least one cell (u32) to be present, which is stricter than the old logic which would happily dereference a property of any size. However we believe all device trees in the wild have at least one cell. Skiboot may produce memory nodes with more than one cell, but that is OK, of_property_read_u32() will return the first one. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [mpe: Expand change log with device tree details] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-21 15:31:25 +11:00
David S. Miller	26440c835f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/asix_common.c net/ipv4/inet_connection_sock.c net/switchdev/switchdev.c In the inet_connection_sock.c case the request socket hashing scheme is completely different in net-next. The other two conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-20 06:08:27 -07:00
Stephane Eranian	24f1a79a5f	perf/powerpc: Add support for PERF_SAMPLE_BRANCH_CALL The patch catches PERF_SAMPLE_BRANCH_CALL because it is not clear whether this is actually supported by the hardware. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-20 10:30:54 +02:00
Michael Ellerman	bed08b7e1f	powerpc/cell: Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU The TUNE_CELL option allows you to build a kernel that runs on multiple CPUs but is tuned (ie. optimised) to run on Cell CPUs. Now days no one is building a distro in that fashion, and any users who are building custom kernels for their Cell machines are better off building with CONFIG_CELL_CPU, which builds a kernel that only runs on Cell and therefore can be optimised even more aggresively. Dropping the option also avoids confusing other users, who are presented with an option to tune for Cell when they are not building for a Cell CPU at all. Suggested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-19 19:51:18 +11:00
$Hon Ching $Vicky$ Lo$ Hon Ching $Vicky$ Lo	9e5d4af458	vTPM: get the buffer allocated for event log instead of the actual log The OS should ask Power Firmware (PFW) for the size of the buffer allocated for the event log, instead of the size of the actual event log. It then passes the buffer adddress and size to PFW in the handover process, into which PFW copies the log. Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>	2015-10-19 01:01:23 +02:00
$Hon Ching $Vicky$ Lo$ Hon Ching $Vicky$ Lo	b4ed0469d0	vTPM: reformat event log to be byte-aligned The event log generated by OpenFirmware in PowerPC is 4-byte aligned. This patch reformats the log to be byte-aligned for the Linux client. Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>	2015-10-19 01:01:23 +02:00
$Hon Ching $Vicky$ Lo$ Hon Ching $Vicky$ Lo	2f82e98265	vTPM: fix searching for the right vTPM node in device tree Replace all occurrences of '/ibm,vtpm' with '/vdevice/vtpm', as only the latter is guanranteed to be available for the client OS. The '/ibm,vtpm' node should only be used by Open Firmware, which is susceptible to changes. Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>	2015-10-19 01:01:22 +02:00
Scott Wood	1930bb5ccf	powerpc/fsl_pci: Don't set up inbound windows in kdump crash kernel Otherwise, because the top end of the crash kernel is treated as the absolute top of memory rather than the beginning of a reserved region, in-flight DMA from the previous kernel that targets areas above the crash kernel can trigger a storm of PCI errors. We only do this for kdump, not normal kexec, in case kexec is being used to upgrade to a kernel that wants a different inbound memory map. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Mingkai Hu <Mingkai.hu@freescale.com>	2015-10-17 00:36:37 -05:00
Scott Wood	1112450a18	powerpc/85xx: Don't use generic timebase sync on 64-bit 85xx currently uses the generic timebase sync mechanism when CONFIG_KEXEC is enabled, because 32-bit 85xx kexec support does a hard reset of each core. 64-bit 85xx kexec does not do this, so we neither need nor want this (nor is the generic timebase sync code built on ppc64). FWIW, I don't like the fact that the hard reset is done on 32-bit kexec, and I especially don't like the timebase sync being triggered only on the presence of CONFIG_KEXEC rather than actually booting in that environment, but that's beyond the scope of this patch... Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-17 00:36:36 -05:00
Scott Wood	9f640bf532	powerpc/fsl-corenet: Disable coreint if kexec is enabled Problems have been observed in coreint (EPR) mode if interrupts are left pending (due to the lack of device quiescence with kdump) after having tried to deliver to a CPU but unable to deliver due to MSR[EE] -- interrupts no longer get reliably delivered in the new kernel. I tried various ways of fixing it up inside the crash kernel itself, and none worked (including resetting the entire mpic). Masking all interrupts and issuing EOIs in the crashing kernel did help a lot of the time, but the behavior was not consistent. Thus, stick to standard IACK mode when kdump is a possibility. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-17 00:36:36 -05:00
Scott Wood	01c593d749	powerpc/fsl-booke-64: Allow booting from the secondary thread This allows SMP kernels to work as kdump crash kernels. While crash kernels don't really need to be SMP, this prevents things from breaking if a user does it anyway (which is not something you want to only find out once the main kernel has crashed in the field, especially if whether it works or not depends on which cpu crashed). Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-17 00:36:35 -05:00
poonam aggrwal	5224644551	powerpc/b4860: Renamed the L2 caches To make provision for more than one L2 caches in the system, change the name from L2 to L2_1; same as in T4 platforms. * Also remove the L2 entry from common file "arch/powerpc/boot/dts/fsl/b4si-post.dtsi" Keep them only in separate files for b4860 and b4420. Signed-off-by: Shaveta Leekha <shaveta@freescale.com> Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-17 00:36:34 -05:00
Hongtao Jia	dc37374b9c	powerpc/fsl: Move Freescale device tree files into fsl folder It makes no sense that some Freescale device tree files are in fsl directory while some others not. This patch move Freescale device tree files into fsl folder. To do that the following two steps are made: - Move Freescale device tree files into fsl folder. - Update the include path in these files from "fsl/.dtsi" to ".dtsi". Please add "fsl/" prefix when you make dtb using Makefile. Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> [scottwood: fixed cuImage rule] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-17 00:36:34 -05:00
poonam aggrwal	2d80f651c8	powerpc/b4860: Removed LIODN register from sRIO node In case of B4860 LIODN register for sRIO is not in GUTs block but in the sRIO register space. Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-17 00:36:33 -05:00
Andy Fleming	c383ee84e1	powerpc/85xx: Add support for Varisys Cyrus board This board uses a P5020 chip, and boots just fine using the corenet_generic code. The device tree is very similar to the P5020DS, except that there is no Flash memory. The environment is, instead, stored on an MMC card on the motherboard. Signed-off-by: Andy Fleming <afleming@gmail.com> [scottwood: fixed trailing whitespace] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-17 00:36:33 -05:00
Scott Wood	072daeed55	powerpc/fsl_pci: Check for get_user/probe_kernel_address failure Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hongtao Jia <hongtao.jia@freescale.com>	2015-10-17 00:35:41 -05:00
Zhao Qiang	a109752ea6	powerpc/t104xd4rdb: add DS26522 nodes to device tree DS26522 is used for tdm, configured by SPI bus. Add nodes under spi node to t104xd4rdb.dtsi. Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-16 19:32:11 -05:00
Uwe Kleine-König	78c102c24c	powerpc/dts: don't fall back to fsl,pq3-gpio for fsl,mpc8572-gpio While the handling of fsl,pq3-gpio and fsl,mpc8572-gpio is done in the same driver and the two hardly differ, the latter controller needs a workaround for an erratum in the gpio_get callback. To make this difference more explicit remove fsl,pq3-gpio from the list of compatibles for mpc8572 machines. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-16 18:46:00 -05:00
Yangbo Lu	07e9117e43	powerpc/dts: Add and fix 1588 timer node for eTSEC Add 1588 timer node in files: arch/powerpc/boot/dts/bsc9131rdb.dtsi arch/powerpc/boot/dts/bsc9132qds.dtsi arch/powerpc/boot/dts/p1010rdb.dtsi arch/powerpc/boot/dts/p1020rdb-pd.dts arch/powerpc/boot/dts/p1021rdb-pc.dtsi arch/powerpc/boot/dts/p1022ds.dtsi arch/powerpc/boot/dts/p1025twr.dtsi For P2020RDB-PC, registers' values should be calculated based on default 1588 reference clock(300MHz) not 250MHz, and fix this in file: arch/powerpc/boot/dts/p2020rdb-pc.dtsi Signed-off-by: Yangbo Lu <yangbo.lu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-16 18:45:11 -05:00
chenhui zhao	881ea7d3f5	powerpc/corenet: use the mixed mode of MPIC when enabling CPU hotplug Core reset may cause issue if using the proxy mode of MPIC. Use the mixed mode of MPIC if enabling CPU hotplug. Signed-off-by: Chenhui Zhao <chenhui.zhao@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-10-16 18:44:27 -05:00
Linus Torvalds	ebb65c81e1	powerpc fixes for 4.3 #3 - Re-enable CONFIG_SCSI_DH in our defconfigs - Remove unused os_area_db_id_video_mode - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew - cxl: Workaround malformed pcie packets on some cards from Philippe - cxl: Fix number of allocated pages in SPA from Christophe Lombard - Fix checkstop in native_hpte_clear() with lockdep from Cyril - Panic on unhandled Machine Check on powernv from Daniel - selftests/powerpc: Fix build failure of load_unaligned_zeropad test -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWIM0xAAoJEFHr6jzI4aWAsCIP/04uAiPCqWOwHjr8/eAlNAmJ GaA6b91QUUpBlyXgzYZShS/FQEnyukbGTUzaS3KwijOdRJtCHxvl2eG7pOCws+GS 2YeA9mBm7MgYT0BJ+KLGCgrF5C/sc+LN3udO9Kf1LimLpp+fIILHgEmhrfy00wUp f7tJ/Rvpt23PmcCDX0PhA7NuOrRu5hQOQ9rsqJfzc7XObZAG1AfISPgALgaeAINc XqQfWiNFLmDJyhV9K39rUXSTvHYl6pPnfDj4GelfjQD2l/csH0M4MeGW2tHNkgVy CakLWOP3zdZVTYTcB8wypnoZxATPhEsHehJmQ4fu3n0WR1vHfCqh4rFZuPaaX0NG P3In0eOV285RIpNLcwkchN+07Ops1Fvi5XonaQpgHCcI9c4H7IAGPbQau2DhR9sU DyZQ+/6wNzpXbM7llM3VyTA2zvvyiuEzuIZI78XWexO/Ny6TCItRtEqJEXMA+ChX lKbLluRnQcnn5sizK0yj4mtkffAbu7Za1KGl1nm1Q/5pBQWsC40wFcRLNNdzqVmH 7tSp8cIEYunCYKy5bAheWJTzpUgGD55EEcUkQFHVm5LKBXyA73qJRSMuLZqtnB3z g6eTiEKhZvVFedNMDNFnNWrvOnd8JpyjGLRAbqgwMhN+lgVvmwwSSB6V2SefMnuL HCSGqR40vPA9bH0Cz/ND =3ze+ -----END PGP SIGNATURE----- Merge tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Re-enable CONFIG_SCSI_DH in our defconfigs - Remove unused os_area_db_id_video_mode - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew - cxl: Workaround malformed pcie packets on some cards from Philippe - cxl: Fix number of allocated pages in SPA from Christophe Lombard - Fix checkstop in native_hpte_clear() with lockdep from Cyril - Panic on unhandled Machine Check on powernv from Daniel - selftests/powerpc: Fix build failure of load_unaligned_zeropad test * tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix build failure of load_unaligned_zeropad test powerpc/powernv: Panic on unhandled Machine Check powerpc: Fix checkstop in native_hpte_clear() with lockdep cxl: Fix number of allocated pages in SPA cxl: Workaround malformed pcie packets on some cards cxl: fix leak of ctx->mapping when releasing kernel API contexts cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API cxl: fix leak of IRQ names in cxl_free_afu_irqs() powerpc/ps3: Remove unused os_area_db_id_video_mode powerpc/configs: Re-enable CONFIG_SCSI_DH	2015-10-16 12:07:43 -07:00
Mahesh Salgaonkar	966d713e86	KVM: PPC: Book3S HV: Deliver machine check with MSR(RI=0) to guest as MCE For the machine check interrupt that happens while we are in the guest, kvm layer attempts the recovery, and then delivers the machine check interrupt directly to the guest if recovery fails. On successful recovery we go back to normal functioning of the guest. But there can be cases where a machine check interrupt can happen with MSR(RI=0) while we are in the guest. This means MC interrupt is unrecoverable and we have to deliver a machine check to the guest since the machine check interrupt might have trashed valid values in SRR0/1. The current implementation do not handle this case, causing guest to crash with Bad kernel stack pointer instead of machine check oops message. [26281.490060] Bad kernel stack pointer 3fff9ccce5b0 at c00000000000490c [26281.490434] Oops: Bad kernel stack pointer, sig: 6 [#1] [26281.490472] SMP NR_CPUS=2048 NUMA pSeries This patch fixes this issue by checking MSR(RI=0) in KVM layer and forwarding unrecoverable interrupt to guest which then panics with proper machine check Oops message. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-16 11:53:47 +11:00
Michael Ellerman	ad987fc8ea	powerpc/xmon: Add some more elements to the existing PACA dump list This patch adds a set of new elements to the existing PACA dump list inside an xmon session which can be listed below improving the overall xmon debug support. With this patch, a typical xmon PACA dump looks something like this. paca for cpu 0x0 @ c00000000fdc0000: possible = yes present = yes online = yes lock_token = 0x8000 (0xa) paca_index = 0x0 (0x8) kernel_toc = 0xc000000001393200 (0x10) kernelbase = 0xc000000000000000 (0x18) kernel_msr = 0xb000000000001033 (0x20) emergency_sp = 0xc00000003fff0000 (0x28) mc_emergency_sp = 0xc00000003ffec000 (0x2e0) in_mce = 0x0 (0x2e8) hmi_event_available = 0x0 (0x2ea) data_offset = 0x1fe7b0000 (0x30) hw_cpu_id = 0x0 (0x38) cpu_start = 0x1 (0x3a) kexec_state = 0x0 (0x3b) slb_shadow[0]: = 0xc000000008000000 0x40016e7779000510 slb_shadow[1]: = 0xd000000008000001 0x400142add1000510 vmalloc_sllp = 0x510 (0x1b8) slb_cache_ptr = 0x4 (0x1ba) slb_cache[0]: = 0x000000000003f000 slb_cache[1]: = 0x0000000000000001 slb_cache[2]: = 0x0000000000000003 slb_cache[3]: = 0x0000000000001000 slb_cache[4]: = 0x0000000000001000 slb_cache[5]: = 0x0000000000000000 slb_cache[6]: = 0x0000000000000000 slb_cache[7]: = 0x0000000000000000 dscr_default = 0x0 (0x58) __current = 0xc000000001331e80 (0x290) kstack = 0xc000000001393e30 (0x298) stab_rr = 0x11 (0x2a0) saved_r1 = 0xc0000001fffef5e0 (0x2a8) trap_save = 0x0 (0x2b8) soft_enabled = 0x0 (0x2ba) irq_happened = 0x1 (0x2bb) io_sync = 0x0 (0x2bc) irq_work_pending = 0x0 (0x2bd) nap_state_lost = 0x0 (0x2be) sprg_vdso = 0x0 (0x2c0) tm_scratch = 0x8000000100009033 (0x2c8) core_idle_state_ptr = (null) (0x2d0) thread_idle_state = 0x0 (0x2d8) thread_mask = 0x0 (0x2d9) subcore_sibling_mask = 0x0 (0x2da) user_time = 0x0 (0x2f0) system_time = 0x0 (0x2f8) user_time_scaled = 0x0 (0x300) starttime = 0x3f462418b5cf4 (0x308) starttime_user = 0x3f4622a57092a (0x310) startspurr = 0xd62a5718 (0x318) utime_sspurr = 0x0 (0x320) stolen_time = 0x0 (0x328) Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Endian swap slb_shadow before display, minor formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:32:02 +11:00
Sam bobroff	0c23a88ccc	powerpc/xmon: Paginate kernel log buffer display The kernel log buffer is often much longer than the size of a terminal so paginate it's output. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:32:02 +11:00
Sam bobroff	958b7c8050	powerpc/xmon: Paged output for paca display The paca display is already more than 24 lines, which can be problematic if you have an old school 80x24 terminal, or more likely you are on a virtual terminal which does not scroll for whatever reason. This patch adds a new command "#", which takes a single (hex) numeric argument: lines per page. It will cause the output of "dp" and "dpa" to be broken into pages, if necessary. Sample output: 0:mon> # 10 0:mon> dp1 paca for cpu 0x1 @ c00000000fdc0480: possible = yes present = yes online = yes lock_token = 0x8000 (0x8) paca_index = 0x1 (0xa) kernel_toc = 0xc000000000eb2400 (0x10) kernelbase = 0xc000000000000000 (0x18) kernel_msr = 0xb000000000001032 (0x20) emergency_sp = 0xc00000003ffe8000 (0x28) mc_emergency_sp = 0xc00000003ffe4000 (0x2e0) in_mce = 0x0 (0x2e8) data_offset = 0x7f170000 (0x30) hw_cpu_id = 0x8 (0x38) cpu_start = 0x1 (0x3a) kexec_state = 0x0 (0x3b) [Hit a key (a:all, q:truncate, any:next page)] 0:mon> __current = 0xc00000007e696620 (0x290) kstack = 0xc00000007e6ebe30 (0x298) stab_rr = 0xb (0x2a0) saved_r1 = 0xc00000007ef37860 (0x2a8) trap_save = 0x0 (0x2b8) soft_enabled = 0x0 (0x2ba) irq_happened = 0x1 (0x2bb) io_sync = 0x0 (0x2bc) irq_work_pending = 0x0 (0x2bd) nap_state_lost = 0x0 (0x2be) 0:mon> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> [mpe: Use bool, make some variables static] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:32:01 +11:00
Christophe Jaillet	b340587e68	powerpc/mpc5xxx: Use of_get_next_parent to simplify code of_get_next_parent can be used to simplify the while() loop and avoid the need of a temp variable. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:32:01 +11:00
Christophe Jaillet	1def37586f	powerpc/numa: Use of_get_next_parent to simplify code of_get_next_parent can be used to simplify the while() loop and avoid the need of a temp variable. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:32:00 +11:00
Paul Gortmaker	5fab1d1cb1	powerpc: Delete old orphaned PrPMC 280/2800 DTS and boot file. In commit `3c8464a9b1` ("powerpc: Delete old PrPMC 280/2800 support") we got rid of most of the C code, and the Makefile/Kconfig hooks, but it seems I left the platform's DTS file orphaned in the tree as well as the boot code. Here we get rid of them both. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:32:00 +11:00
Stephen Rothwell	4c8123181d	powerpc: discard .exit.data at runtime .exit.text is discarded at run time and there are some references from that to .exit.data, so we need to discard .exit.data at run time as well. Fixes these errors: `.exit.data' referenced in section `.exit.text' of drivers/built-in.o: defined in discarded section `.exit.data' of drivers/built-in.o `.exit.data' referenced in section `.exit.text' of drivers/built-in.o: defined in discarded section `.exit.data' of drivers/built-in.o Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:59 +11:00
Gavin Shan	54f9a64a36	powerpc/eeh: atomic_dec_if_positive() to update passthru count No need to have two atomic opertions (update and fetch/check) when decreasing PE's number of passed devices as one atomic operation is enough. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:58 +11:00
Andrew Donnellan	6b8b252f40	powerpc/pci: export pcibios_free_controller() Export pcibios_free_controller(), so it can be used by the cxl module to free virtual PHBs. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:57 +11:00
Sam bobroff	a34236155a	powerpc: Individual System V IPC system calls This patch provides individual system call numbers for the following System V IPC system calls, on PowerPC, so that they do not need to be multiplexed: * semop, semget, semctl, semtimedop * msgsnd, msgrcv, msgget, msgctl * shmat, shmdt, shmget, shmctl Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:57 +11:00
Michael Ellerman	f7688056cb	powerpc/pseries: Drop always true CONFIG_PSERIES_MSI Now that pseries selects PCI_MSI && PCI, EEH will always be true, and therefore CONFIG_PSERIES_MSI will always be true. So drop it, and move msi.o to obj-y. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:56 +11:00
Michael Ellerman	44f2aecfd0	powerpc/pseries: Move PCI objects to obj-y Make it entirely clear in the Makefile that we always build the pci related files by moving them to obj-y. Note that CONFIG_EEH is now always enabled on pseries, because it depends on PSERIES && PCI. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:56 +11:00
Michael Ellerman	84eb9e612b	powerpc/pseries: Remove use of CONFIG_PCI Now that we always have CONFIG_PCI=y for pseries, we can stop guarding code with CONFIG_PCI ifdefs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:55 +11:00
Michael Ellerman	4c9cd468b3	powerpc/pseries: Make PCI non-optional The pseries build with PCI=n looks to have been broken for at least 5 years, and no one's noticed or cared. Following the obvious breakages backward, the first commit I can find that builds is the parent of `2eb4afb69f` ("powerpc/pci: Move pseries code into pseries platform specific area") from April 2009. A distro would never ship a PCI=n kernel, so it is only useful for folks building custom kernels. Also on KVM the virtio devices appear on PCI, so it would only be useful if you were building kernels specifically to run on PowerVM and with no PCI devices. The added code complexity, and testing load (which we've clearly not been doing), is not justified by the small reduction in kernel size for such a niche use case. So just make PCI non-optional on pseries. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-15 20:31:54 +11:00
Tudor Laurentiu	224f363246	KVM: PPC: e500: fix couple of shift operations on 64 bits Fix couple of cases where we shift left a 32-bit value thus might get truncated results on 64-bit targets. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Suggested-by: Scott Wood <scotttwood@freescale.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-15 15:59:19 +11:00
Tudor Laurentiu	2daab50e17	KVM: PPC: e500: Emulate TMCFG0 TMRN register Emulate TMCFG0 TMRN register exposing one HW thread per vcpu. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> [Laurentiu.Tudor@freescale.com: rebased on latest kernel, use define instead of hardcoded value, moved code in own function] Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Acked-by: Scott Wood <scotttwood@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-15 15:58:16 +11:00
Andrzej Hajda	d4cd4f9586	KVM: PPC: e500: fix handling local_sid_lookup result The function can return negative value. The problem has been detected using proposed semantic patch scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2046107 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-15 15:58:16 +11:00
Tudor Laurentiu	6a14c22224	powerpc/e6500: add TMCFG0 register definition The register is not currently used in the base kernel but will be in a forthcoming kvm patch. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-10-15 15:58:16 +11:00
Marc Zyngier	5d4c9bc776	irqdomain: Use irq_domain_get_of_node() instead of direct field access The struct irq_domain contains a "struct device_node *" field (of_node) that is almost the only link between the irqdomain and the device tree infrastructure. In order to prepare for the removal of that field, convert all users to use irq_domain_get_of_node() instead. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: <linux-arm-kernel@lists.infradead.org> Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Graeme Gregory <graeme@xora.org.uk> Cc: Jake Oshins <jakeo@microsoft.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Link: http://lkml.kernel.org/r/1444737105-31573-2-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-10-13 19:01:23 +02:00
Aneesh Kumar K.V	891121e6c0	powerpc/mm: Differentiate between hugetlb and THP during page walk We need to properly identify whether a hugepage is an explicit or a transparent hugepage in follow_huge_addr(). We used to depend on hugepage shift argument to do that. But in some case that can result in wrong results. For ex: On finding a transparent hugepage we set hugepage shift to PMD_SHIFT. But we can end up clearing the thp pte, via pmdp_huge_get_and_clear. We do prevent reusing the pfn page via the usage of kick_all_cpus_sync(). But that happens after we updated the pte to 0. Hence in follow_huge_addr() we can find hugepage shift set, but transparent huge page check fail for a thp pte. NOTE: We fixed a variant of this race against thp split in commit `691e95fd73` ("powerpc/mm/thp: Make page table walk safe against thp split/collapse") Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in follow_page_mask occasionally. In the long term, we may want to switch ppc64 64k page size config to enable CONFIG_ARCH_WANT_GENERAL_HUGETLB Reported-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-12 15:30:09 +11:00
Aneesh Kumar K.V	ec2640b114	powerpc/mm: Disable hugepd for 64K page size. After commit `e2b3d202d1` ("powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format"), we don't need to support is_hugepd() for 64K page size. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-12 15:29:59 +11:00
Daniel Axtens	f2dd80ecca	powerpc/powernv: Panic on unhandled Machine Check All unrecovered machine check errors on PowerNV should cause an immediate panic. There are 2 reasons that this is the right policy: it's not safe to continue, and we're already trying to reboot. Firstly, if we go through the recovery process and do not successfully recover, we can't be sure about the state of the machine, and it is not safe to recover and proceed. Linux knows about the following sources of Machine Check Errors: - Uncorrectable Errors (UE) - Effective - Real Address Translation (ERAT) - Segment Lookaside Buffer (SLB) - Translation Lookaside Buffer (TLB) - Unknown/Unrecognised In the SLB, TLB and ERAT cases, we can further categorise these as parity errors, multihit errors or unknown/unrecognised. We can handle SLB errors by flushing and reloading the SLB. We can handle TLB and ERAT multihit errors by flushing the TLB. (It appears we may not handle TLB and ERAT parity errors: I will investigate further and send a followup patch if appropriate.) This leaves us with uncorrectable errors. Uncorrectable errors are usually the result of ECC memory detecting an error that it cannot correct, but they also crop up in the context of PCI cards failing during DMA writes, and during CAPI error events. There are several types of UE, and there are 3 places a UE can occur: Skiboot, the kernel, and userspace. For Skiboot errors, we have the facility to make some recoverable. For userspace, we can simply kill (SIGBUS) the affected process. We have no meaningful way to deal with UEs in kernel space or in unrecoverable sections of Skiboot. Currently, these unrecovered UEs fall through to machine_check_expection() in traps.c, which calls die(), which OOPSes and sends SIGBUS to the process. This sometimes allows us to stumble onwards. For example we've seen UEs kill the kernel eehd and khugepaged. However, the process killed could have held a lock, or it could have been a more important process, etc: we can no longer make any assertions about the state of the machine. Similarly if we see a UE in skiboot (and again we've seen this happen), we're not in a position where we can make any assertions about the state of the machine. Likewise, for unknown or unrecognised errors, we're not able to say anything about the state of the machine. Therefore, if we have an unrecovered MCE, the most appropriate thing to do is to panic. The second reason is that since `e784b6499d` ("powerpc/powernv: Invoke opal_cec_reboot2() on unrecoverable machine check errors."), we attempt a special OPAL reboot on an unhandled MCE. This is so the hardware can record error data for later debugging. The comments in that commit assert that we are heading down the panic path anyway. At the moment this is not always true. With UEs in kernel space, for instance, they are marked as recoverable by the hardware, so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't panic() but would simply die() and OOPS. It doesn't make sense to be staggering on if we've just tried to reboot: we should panic(). Explicitly panic() on unrecovered MCEs on PowerNV. Update the comments appropriately. This fixes some hangs following EEH events on cxlflash setups. Signed-off-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-09 08:07:19 +11:00
Colin Ian King	c13e1c05b2	powerpc/pseries/hvcserver: don't memset pi_buff if it is null pi_buff is being memset before it is sanity checked. Move the memset after the null pi_buff sanity check to avoid an oops. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-09 08:03:03 +11:00
Aneesh Kumar K.V	f78f7ed726	powerpc: Fix _ALIGN_* errors due to type difference. This avoid errors like unsigned int usize = 1 << 30; int size = 1 << 30; unsigned long addr = 64UL << 30 ; value = _ALIGN_DOWN(addr, usize); -> 0 value = _ALIGN_DOWN(addr, size); -> 0x1000000000 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-09 08:02:25 +11:00
Cyril Bur	fdf880a608	powerpc: Fix checkstop in native_hpte_clear() with lockdep native_hpte_clear() is called in real mode from two places: - Early in boot during htab initialisation if firmware assisted dump is active. - Late in the kexec path. In both contexts there is no need to disable interrupts are they are already disabled. Furthermore, locking around the tlbie() is only required for pre POWER5 hardware. On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre POWER5 hardware concurrent tlbie()s could result in deadlock. This code would only be executed at crashdump time, during which all bets are off, concurrent tlbie()s are unlikely and taking locks is unsafe therefore the best course of action is to simply do nothing. Concurrent tlbie()s are not possible in the first case as secondary CPUs have not come up yet. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-09 08:01:38 +11:00
Chris Metcalf	7a5692e6e5	arch/powerpc: provide zero_bytemask() for big-endian For some reason, only the little-endian flavor of powerpc provided the zero_bytemask() implementation. Reported-by: Michal Sojka <sojkam1@fel.cvut.cz> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>	2015-10-08 11:44:12 -04:00
Ingo Molnar	d3df65c198	Merge branch 'perf/urgent' into perf/core, before pulling new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-08 10:52:18 +02:00
Claudiu Manoil	70963d245e	powerpc: dts: p1022si: Add fsl,wake-on-filer for eTSEC Enable the "wake-on-filer" (aka. wake on user defined packet) wake on lan capability for the eTSEC ethernet nodes. Cc: Li Yang <leoli@freescale.com> Cc: Zhao Chenhui <chenhui.zhao@freescale.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-07 04:19:43 -07:00
Chris Metcalf	19c22f3a29	word-at-a-time.h: fix some Kbuild files arch/tile added word-at-a-time.h after the patch that added generic-y entries; the generic-y entry is now stale. arch/h8300 is newer than the generic-y patch for word-at-a-time.h, and needs a generic-y entry. arch/powerpc seems to have gotten a generic-y entry by mistake in the first patch; this change removes it. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>	2015-10-06 14:52:48 -04:00
Samuel Mendoza-Jonas	1b70386c99	powerpc/kexec: Wait 1s for secondaries to enter OPAL Always include a timeout when waiting for secondary cpus to enter OPAL in the kexec path, rather than only when crashing. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-06 21:57:42 +11:00
Christophe Leroy	255e8e046c	powerpc/8xx: Shorten irq_chip name for the SIU show_interrupts() expects the irq_chip name to be max 8 characters otherwise everything get misaligned # cat /proc/interrupts CPU0 17: 0 CPM PIC 0 Level error 19: 0 MPC8XX SIU 15 Level tbint 20: 90 CPM PIC 4 Level cpm_uart 38: 29746 MPC8XX SIU 5 Level fs_enet-mac 39: 0 MPC8XX SIU 7 Level fs_enet-mac 47: 401 CPM PIC 5 Level fsl_spi 68: 1 MPC8XX SIU 2 Level phy_interrupt, phy_interrupt, phy_interrupt LOC: 7225485 Local timer interrupts for timer event device LOC: 9 Local timer interrupts for others SPU: 0 Spurious interrupts PMI: 0 Performance monitoring interrupts MCE: 0 Machine check exceptions Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-06 21:54:42 +11:00
Denis Kirjanov	cb2d3883c6	powerpc/msi: Free the bitmap if it was slab allocated During the MSI bitmap test on boot kmemleak spews the following trace: unreferenced object 0xc00000016e86c900 (size 64): comm "swapper/0", pid 1, jiffies 4294893173 (age 518.024s) hex dump (first 32 bytes): 00 00 01 ff 7f ff 7f 37 00 00 00 00 00 00 00 00 .......7........ ff ff ff ff ff ff ff ff 01 ff ff ff ff ff ff ff ................ backtrace: [<c00000000003eebc>] .zalloc_maybe_bootmem+0x3c/0x380 [<c000000000042d6c>] .msi_bitmap_alloc+0x3c/0xb0 [<c000000000a9aff8>] .msi_bitmap_selftest+0x30/0x2b4 [<c0000000000090f4>] .do_one_initcall+0xd4/0x270 [<c000000000a8e250>] .kernel_init_freeable+0x1a0/0x280 [<c000000000009b5c>] .kernel_init+0x1c/0x120 [<c000000000007fbc>] .ret_from_kernel_thread+0x58/0x9c Add a flag to msi_bitmap for tracking allocations from slab and memblock so we can properly free/handle memory in msi_bitmap_free(). Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> [mpe: Reword changelog & use bitmap_from_slab in the if] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-05 21:32:50 +11:00
Andy Shevchenko	06bacefcbd	powerpc/pseries: re-use code from of_helpers module The derive_parent() has similar semantics to what we have in newly introduced of_helpers module. The replacement reduces code base and propagates the actual error code to the caller. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-05 21:11:28 +11:00
Andy Shevchenko	a46d988409	powerpc/pseries: handle nodes without '/' In case we have node without '/' strrchr() returns NULL which might lead to crash. Replace strrchr() by kbasename() and modify condition to avoid such behaviour. Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-05 21:11:27 +11:00
Andy Shevchenko	a030e1e4bb	powerpc/pseries: replace kmalloc + strlcpy The helper kstrndup() will do the same in one line. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-05 21:11:26 +11:00
Andy Shevchenko	dc85aaed64	powerpc/pseries: fix a potential memory leak In case we have a full node name like /foo/bar and /foo is not found the parent_path left unfreed. So, free a memory before return to a caller. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-05 21:11:25 +11:00
Andy Shevchenko	948ad1acaf	powerpc/pseries: extract of_helpers module Extract a new module to share the code between other modules. There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-05 21:11:24 +11:00
Linus Torvalds	30c44659f4	Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile Pull strscpy string copy function implementation from Chris Metcalf. Chris sent this during the merge window, but I waffled back and forth on the pull request, which is why it's going in only now. The new "strscpy()" function is definitely easier to use and more secure than either strncpy() or strlcpy(), both of which are horrible nasty interfaces that have serious and irredeemable problems. strncpy() has a useless return value, and doesn't NUL-terminate an overlong result. To make matters worse, it pads a short result with zeroes, which is a performance disaster if you have big buffers. strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking the insane NUL padding, but having a differently broken return value which returns the original length of the source string. Which means that it will read characters past the count from the source buffer, and you have to trust the source to be properly terminated. It also makes error handling fragile, since the test for overflow is unnecessarily subtle. strscpy() avoids both these problems, guaranteeing the NUL termination (but not excessive padding) if the destination size wasn't zero, and making the overflow condition very obvious by returning -E2BIG. It also doesn't read past the size of the source, and can thus be used for untrusted source data too. So why did I waffle about this for so long? Every time we introduce a new-and-improved interface, people start doing these interminable series of trivial conversion patches. And every time that happens, somebody does some silly mistake, and the conversion patch to the improved interface actually makes things worse. Because the patch is mindnumbing and trivial, nobody has the attention span to look at it carefully, and it's usually done over large swatches of source code which means that not every conversion gets tested. So I'm pulling the strscpy() support because it is a better interface. But I will refuse to pull mindless conversion patches. Use this in places where it makes sense, but don't do trivial patches to fix things that aren't actually known to be broken. * 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: tile: use global strscpy() rather than private copy string: provide strscpy() Make asm/word-at-a-time.h available on all architectures	2015-10-04 16:31:13 +01:00
Daniel Borkmann	a91263d520	ebpf: migrate bpf_prog's flags to bitfield As we need to add further flags to the bpf_prog structure, lets migrate both bools to a bitfield representation. The size of the base structure (excluding insns) remains unchanged at 40 bytes. Add also tags for the kmemchecker, so that it doesn't throw false positives. Even in case gcc would generate suboptimal code, it's not being accessed in performance critical paths. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-03 05:02:39 -07:00
Christophe Jaillet	b6080db4f4	powerpc/nvram: Fix function name in some errors messages. 'nvram_create_os_partition' should be 'nvram_create_partition'. Use __func__ to have it right, as done elsewhere in this file. Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-02 22:55:05 +10:00
Christophe Jaillet	7d52318717	powerpc/nvram: Add missing kfree in error path If 'nvram_write_header' fails, then 'new_part' should be freed, otherwise, there is a memory leak. Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-02 22:54:55 +10:00
Michael Ellerman	2adc48a691	powerpc: Add ppc64le_defconfig Based directly on ppc64_defconfig using merge_config. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-01 16:52:03 +10:00
Aneesh Kumar K.V	65d3223a85	powerpc/mm: Add virt_to_pfn and use this instead of opencoding This add helper virt_to_pfn and remove the opencoded usage of the same. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-01 16:52:02 +10:00
Michael Neuling	c974809a26	powerpc/vdso: Avoid link stack corruption in __get_datapage() powerpc has a link register (lr) used for calling functions. We "bl <func>" to call a function, and "blr" to return back to the call site. The lr is only a single register, so if we call another function from inside this function (ie. nested calls), software must save away the lr on the software stack before calling the new function. Before returning (ie. before the "blr"), the lr is restored by software from the software stack. This makes branch prediction quite difficult for the processor as it will only know the branch target just before the "blr". To help with this, modern powerpc processors keep a (non-architected) hardware stack of lr called a "link stack". When a "bl <func>" is run, the lr is pushed onto this stack. When a "blr" is called, the branch predictor pops the lr value from the top of the link stack, and uses it to predict the branch target. Hence the processor pipeline knows a lot earlier the branch target. This works great but there are some cases where you call "bl" but without a matching "blr". Once such case is when trying to determine the program counter (which can't be read directly). Here you "bl+4; mflr" to get the program counter. If you do this, the link stack will get out of sync with reality, causing the branch predictor to mis-predict subsequent function returns. To avoid this, modern micro-architectures have a special case of bl. Using the form "bcl 20,31,+4", ensures the processor doesn't push to the link stack. The 32 and 64 bit variants of __get_datapage() use a "bl; mflr" to determine the loaded address of the VDSO. The current versions of these attempt to use this special bl variant. Unfortunately they use +8 rather than the required +4. Hence the current code results in the link stack getting out of sync with reality and hence the resulting performance degradation. This patch moves it to bcl+4 by moving __kernel_datapage_offset out of __get_datapage(). With this patch, running a gettimeofday() (which uses __get_datapage()) microbenchmark we get a decent bump in performance on POWER7/8. For the benchmark in tools/testing/selftests/powerpc/benchmarks/gettimeofday.c POWER8: 64bit gets ~4% improvement 32bit gets ~9% improvement POWER7: 64bit gets ~7% improvement Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Aaron Sawdey <sawdey@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-01 16:52:02 +10:00
Michael Ellerman	26cd835ef8	powerpc/slb: Use a local to avoid multiple calls to get_slb_shadow() For no reason other than it looks ugly. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-01 16:52:01 +10:00
Anshuman Khandual	1d15010c34	powerpc/slb: Define an enum for the bolted indexes This patch defines macros for the three bolted SLB indexes we use. Switch the functions that take the indexes as an argument to use the enum. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-01 16:52:01 +10:00
Michael Ellerman	787b393c9f	powerpc/vdso: Emit GNU & SysV hashes Andy Lutomirski says: Some dynamic loaders may be slightly faster if a GNU hash is available. This is unlikely to have any measurable effect on the time it takes to resolve vdso symbols (since there are so few of them). In some contexts, it can be a win for a different reason: if every DSO has a GNU hash section, then libc can avoid calculating SysV hashes at all. Both musl and glibc appear to have this optimization. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-01 16:52:00 +10:00
Michael Ellerman	4fa9a3f6b6	powerpc/ps3: Remove unused os_area_db_id_video_mode This struct is unused, which is now a build error with gcc 6: error: 'os_area_db_id_video_mode' defined but not used There doesn't seem to be any good reason to keep it around so remove it, it's in the history if anyone needs it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-30 10:33:02 +10:00
Geoff Levand	336382c78b	powerpc/ps3: Refresh ps3_defconfig Refresh and remove obsolete CONFIG_EXT3_FS. Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-29 23:01:04 +10:00
Boqun Feng	e5e16d8f3e	powerpc: Kconfig: remove BE-only platforms from LE kernel build Currently, little endian is only supported on powernv and pseries, however, Kconfigs still allow us to include other platforms in a LE kernel, this may result in space wasting or even build error if some BE-only platforms always assume they are built for a BE kernel. So just modify the Kconfigs of BE-only platforms to remove them from being built for a LE kernel. For 32bit only platforms, nothing needs to be done, because CPU_LITTLE_ENDIAN depends on PPC64. For 64bit supported platforms, add CPU_BIG_ENDIAN to dependencies explicitly, so that these platforms will be disabled for LE [Suggested-by: Cédric Le Goater <clg@fr.ibm.com>]. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-29 22:57:00 +10:00
Michael Ellerman	6b98f2bec5	powerpc/configs: Re-enable CONFIG_SCSI_DH Commit `086b91d052` ("scsi_dh: integrate into the core SCSI code") changed CONFIG_SCSI_DH from tristate to bool. Our defconfigs have CONFIG_SCSI_DH=m, which the kconfig machinery warns us is invalid, but instead of converting it to =y it leaves it unset. This means we loose the CONFIG_SCSI_DH code and everything that depends on it. So convert the values in the defconfigs to =y. Fixes: `086b91d052` ("scsi_dh: integrate into the core SCSI code") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-29 20:34:03 +10:00
Ingo Molnar	6afc0c269c	Merge branch 'linus' into perf/core, to pick up fixes before applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-09-28 08:06:57 +02:00
Linus Torvalds	966966a630	PCI updates for v4.3: Resource management - Revert pci_read_bridge_bases() unification (Bjorn Helgaas) - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas) MSI - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson) Renesas R-Car host bridge driver - Add R8A7794 support (Sergei Shtylyov) Miscellaneous - Fix devfn for VPD access through function 0 (Alex Williamson) - Use function 0 VPD only for identical functions (Alex Williamson) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWBUq2AAoJEFmIoMA60/r8wbwP/0D/+fKEPYJlB6hx1wLHpVk3 K//vEwH0RgA3v2X53QUoHg94gTYhSZLKX0zdAFshbphE0HCZ6AO3UO+/ZJ3cui6J PYvKOnhby2ErNotZqrs3DQIM8rGgl0ZVgoFQrAWEvwiHRHI/r2ArK/oR4PiBjxJT StYuJoTkZIlJyHXza6tvHDcWi+Jc8t8r0HC4Vs32BlaVBQM0SH3CMxHfhJw/Q9xP WHFif1sH0N+p7WDyHH71C1T8POOgXY73BsD2AC0se3lRYZ9SVkOVy9ECGUucx8F6 LDAuFelwRvW2Dr9kh38+5f8Xp155E+eZ6zRWW9/JlrUKVEtHhOFhtrRfDNKHuDCt B9ETrxDiSUFAdQ2weye9BK6aXK0CHF6YP3PCbvK77qFUUsN8csFSKktanKrFAbML CdjkVkEoeLHw+aXzyDg0pSBRZMQ24dTQDh7YqOFZGuEjCLPXOEQ8nitf0IzBB0KI 4QetT/QK3bKkgtVKTwPP+s9f4g+fA/oiwJ21ZTV9hi/9upywTa/umCUvH9Fmb8Fp VZeTzugSht0+ioXpaF/6/KO0Ccp/t5uAHYeuBBMqiHX7ks8DdnfPCwbWNRKkg25O Qy7Y8VnnOtesRCAqBq5y/hHlLUluMkjYpEblYFiD6HBWcjUh6xE6LlIO4mwnjqWI zjB3w7+0GOrvS7dBSx0N =f4c3 -----END PGP SIGNATURE----- Merge tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "These are fixes for things we merged for v4.3 (VPD, MSI, and bridge window management), and a new Renesas R8A7794 SoC device ID. Details: Resource management: - Revert pci_read_bridge_bases() unification (Bjorn Helgaas) - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas) MSI: - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson) Renesas R-Car host bridge driver: - Add R8A7794 support (Sergei Shtylyov) Miscellaneous: - Fix devfn for VPD access through function 0 (Alex Williamson) - Use function 0 VPD only for identical functions (Alex Williamson)" * tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: rcar: Add R8A7794 support PCI: Use function 0 VPD for identical functions, regular VPD for others PCI: Fix devfn for VPD access through function 0 PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses PCI: Clear IORESOURCE_UNSET when clipping a bridge window PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"	2015-09-25 11:16:53 -07:00
Linus Torvalds	b6d980f493	AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC bug fixes too. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWBSn7AAoJEL/70l94x66Dxd4H/RT6kWWj9x4grEYUkcJUDyK2 AXm7XcKQm04auwAic8Otr+ts/Qix/50kWmBe/TU0QLgqb8rj5Dj3yGFK6Z1y6mAz KvaxqMJd4tZGTqN0DDvC2ItEdzjfAdeJZo/FHXqPHVspG0G14T7STLna02LTBBEJ tNzY9qor8nFhg2fT2szqKaudUNgTqkCTpo57o2BrHE96SHG+m0WdpQCV1F5hPVpg Te0Pb7qX9xng5n3sQ7IV/t3QYbrza1ACwNQS9XJa0Yu6iEz7JdmVmzHQASK9ynn6 hUHhsNYGx4IsPjPtfJk2GroNaRDZL+VMzw07tfcOvPx8xkS9hS63pwzmSBqfLrM= =Ywqn -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC bug fixes too" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: disable halt_poll_ns as default for s390x KVM: x86: fix off-by-one in reserved bits check KVM: x86: use correct page table format to check nested page table reserved bits KVM: svm: do not call kvm_set_cr0 from init_vmcb KVM: x86: trap AMD MSRs for the TSeg base and mask KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs kvm: svm: reset mmu on VCPU reset	2015-09-25 10:51:40 -07:00
David Hildenbrand	920552b213	KVM: disable halt_poll_ns as default for s390x We observed some performance degradation on s390x with dynamic halt polling. Until we can provide a proper fix, let's enable halt_poll_ns as default only for supported architectures. Architectures are now free to set their own halt_poll_ns default value. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-25 10:31:30 +02:00
Michael Ellerman	793b8bf9ca	powerpc: Wire up sys_membarrier() The selftest passes on 64-bit LE & BE, and 32-bit. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-21 17:27:08 +10:00
Thomas Huth	3eb4ee6825	KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() Access to the kvm->buses (like with the kvm_io_bus_read() and -write() functions) has to be protected via the kvm->srcu lock. The kvmppc_h_logical_ci_load() and -store() functions are missing this lock so far, so let's add it there, too. This fixes the problem that the kernel reports "suspicious RCU usage" when lock debugging is enabled. Cc: stable@vger.kernel.org # v4.1+ Fixes: `99342cf804` Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-09-21 09:05:15 +10:00
Gautham R. Shenoy	7e022e717f	KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit In guest_exit_cont we call kvmhv_commence_exit which expects the trap number as the argument. However r3 doesn't contain the trap number at this point and as a result we would be calling the function with a spurious trap number. Fix this by copying r12 into r3 before calling kvmhv_commence_exit as r12 contains the trap number. Cc: stable@vger.kernel.org # v4.1+ Fixes: `eddb60fb14` Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-09-21 09:05:12 +10:00
Paul Mackerras	5fc3e64f94	KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs This fixes a bug which results in stale vcore pointers being left in the per-cpu preempted vcore lists when a VM is destroyed. The result of the stale vcore pointers is usually either a crash or a lockup inside collect_piggybacks() when another VM is run. A typical lockup message looks like: [ 472.161074] NMI watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [qemu-system-ppc:7039] [ 472.161204] Modules linked in: kvm_hv kvm_pr kvm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ses enclosure shpchp rtc_opal i2c_opal powernv_rng binfmt_misc dm_service_time scsi_dh_alua radeon i2c_algo_bit drm_kms_helper ttm drm tg3 ptp pps_core cxgb3 ipr i2c_core mdio dm_multipath [last unloaded: kvm_hv] [ 472.162111] CPU: 24 PID: 7039 Comm: qemu-system-ppc Not tainted 4.2.0-kvm+ #49 [ 472.162187] task: c000001e38512750 ti: c000001e41bfc000 task.ti: c000001e41bfc000 [ 472.162262] NIP: c00000000096b094 LR: c00000000096b08c CTR: c000000000111130 [ 472.162337] REGS: c000001e41bff520 TRAP: 0901 Not tainted (4.2.0-kvm+) [ 472.162399] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24848844 XER: 00000000 [ 472.162588] CFAR: c00000000096b0ac SOFTE: 1 GPR00: c000000000111170 c000001e41bff7a0 c00000000127df00 0000000000000001 GPR04: 0000000000000003 0000000000000001 0000000000000000 0000000000874821 GPR08: c000001e41bff8e0 0000000000000001 0000000000000000 d00000000efde740 GPR12: c000000000111130 c00000000fdae400 [ 472.163053] NIP [c00000000096b094] _raw_spin_lock_irqsave+0xa4/0x130 [ 472.163117] LR [c00000000096b08c] _raw_spin_lock_irqsave+0x9c/0x130 [ 472.163179] Call Trace: [ 472.163206] [c000001e41bff7a0] [c000001e41bff7f0] 0xc000001e41bff7f0 (unreliable) [ 472.163295] [c000001e41bff7e0] [c000000000111170] __wake_up+0x40/0x90 [ 472.163375] [c000001e41bff830] [d00000000efd6fc0] kvmppc_run_core+0x1240/0x1950 [kvm_hv] [ 472.163465] [c000001e41bffa30] [d00000000efd8510] kvmppc_vcpu_run_hv+0x5a0/0xd90 [kvm_hv] [ 472.163559] [c000001e41bffb70] [d00000000e9318a4] kvmppc_vcpu_run+0x44/0x60 [kvm] [ 472.163653] [c000001e41bffba0] [d00000000e92e674] kvm_arch_vcpu_ioctl_run+0x64/0x170 [kvm] [ 472.163745] [c000001e41bffbe0] [d00000000e9263a8] kvm_vcpu_ioctl+0x538/0x7b0 [kvm] [ 472.163834] [c000001e41bffd40] [c0000000002d0f50] do_vfs_ioctl+0x480/0x7c0 [ 472.163910] [c000001e41bffde0] [c0000000002d1364] SyS_ioctl+0xd4/0xf0 [ 472.163986] [c000001e41bffe30] [c000000000009260] system_call+0x38/0xd0 [ 472.164060] Instruction dump: [ 472.164098] ebc1fff0 ebe1fff8 7c0803a6 4e800020 60000000 60000000 60420000 8bad02e2 [ 472.164224] 7fc3f378 4b6a57c1 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820070 The bug is that kvmppc_run_vcpu does not correctly handle the case where a vcpu task receives a signal while its guest vcpu is executing in the guest as a result of being piggy-backed onto the execution of another vcore. In that case we need to wait for the vcpu to finish executing inside the guest, and then remove this vcore from the preempted vcores list. That way, we avoid leaving this vcpu's vcore on the preempted vcores list when the vcpu gets interrupted. Fixes: `ec25716508` Reported-by: Thomas Huth <thuth@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-09-21 09:00:47 +10:00
Linus Torvalds	3ae839454e	Mostly stable material, a lot of ARM fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJV+/ucAAoJEL/70l94x66DV8YH/1KDym/1GJ+/Br/YkHZnM53l 3Q0PwSLu9cNcIL9lUuDLwGTaVj+y8ud1Hjr/uzvKwivktmUYVZhkdtnZmnanvGOM qKB9K3nFXCPx8uqy8Dn7fOwEKcg9FmDOTTkWy13HDnXO+V4crSVVt+rPw+6FUMld NV5tYdw9Lu7y3XrveDebPWaPtyDL7OAagzmeK47eMffxG7X9Hf1H2aT7HueRi7x/ SkLIe3gmiOWmHVJDPE9TOmFYIj19gywDFysKes1gdVJLVUIXiELMT7SrvAYnToVB zISIEj7Zx4SINPxpf2dUn8REm7NsmJY+PffLIl/Nv+ozGggFQGFH0SMZ08p0bxw= =tfmn -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "Mostly stable material, a lot of ARM fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) sched: access local runqueue directly in single_task_running arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' arm64: KVM: Remove all traces of the ThumbEE registers arm: KVM: Disable virtual timer even if the guest is not using it arm64: KVM: Disable virtual timer even if the guest is not using it arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources KVM: s390: Replace incorrect atomic_or with atomic_andnot arm: KVM: Fix incorrect device to IPA mapping arm64: KVM: Fix user access for debug registers KVM: vmx: fix VPID is 0000H in non-root operation KVM: add halt_attempted_poll to VCPU stats kvm: fix zero length mmio searching kvm: fix double free for fast mmio eventfd kvm: factor out core eventfd assign/deassign logic kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd KVM: make the declaration of functions within 80 characters KVM: arm64: add workaround for Cortex-A57 erratum #852523 KVM: fix polling for guest halt continued even if disable it arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores arm64: KVM: set {v,}TCR_EL2 RES1 bits ...	2015-09-18 09:23:08 -07:00
Linus Torvalds	fadb97b089	Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This is a rather large update post rc1 due to the final steps of cleanups and API changes which had to wait for the preparatory patches to hit your tree. - Regression fixes for ARM GIC irqchips - Regression fixes and lockdep anotations for renesas irq chips - The leftovers of the cleanup and preparatory patches which have been ignored by maintainers - Final conversions of the newly merged users of obsolete APIs - Final removal of obsolete APIs - Final removal of ARM artifacts which had been introduced during the conversion of ARM to the generic interrupt code. - Final split of the irq_data into chip specific and common data to reflect the needs of hierarchical irq domains. - Treewide removal of the first argument of interrupt flow handlers, i.e. the irq number, which is not used by the majority of handlers and simple to retrieve from the other argument the irq descriptor. - A few comment updates and build warning fixes" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) arm64: Remove ununsed set_irq_flags ARM: Remove ununsed set_irq_flags sh: Kill off set_irq_flags usage irqchip: Kill off set_irq_flags usage gpu/drm: Kill off set_irq_flags usage genirq: Remove irq argument from irq flow handlers genirq: Move field 'msi_desc' from irq_data into irq_common_data genirq: Move field 'affinity' from irq_data into irq_common_data genirq: Move field 'handler_data' from irq_data into irq_common_data genirq: Move field 'node' from irq_data into irq_common_data irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag genirq: Provide IRQD_FORWARDED_TO_VCPU status flag genirq: Simplify irq_data_to_desc() genirq: Remove __irq_set_handler_locked() pinctrl/pistachio: Use irq_set_handler_locked gpio: vf610: Use irq_set_handler_locked powerpc/mpc8xx: Use irq_set_handler_locked() powerpc/ipic: Use irq_set_handler_locked() powerpc/cpm2: Use irq_set_handler_locked() ...	2015-09-18 08:11:42 -07:00
Linus Torvalds	f240bdd2a5	powerpc fixes for 4.3 - Fix 32-bit TCE table init in kdump kernel from Nish - Fix kdump with non-power-of-2 crashkernel= from Nish - Abort cxl_pci_enable_device_hook() if PCI channel is offline from Andrew - Fix to release DRC when configure_connector() fails from Bharata - Wire up sys_userfaultfd() - Fix race condition in tearing down MSI interrupts from Paul - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel - Fix cxl build failure due to -Wunused-variable gcc behaviour change from Ian - Tell the toolchain to use ABI v2 when building an LE boot wrapper from Benh - Fix THP to recompute hash value after a failed update from Aneesh - 32-bit memcpy/memset: only use dcbz once cache is enabled from Christophe -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV+h5EAAoJEFHr6jzI4aWADboP/3jc6vUtOmFZ1GBGwXrftOc0 PGtkTtYnGbaNsp+ZBl39Y+CFsSfhFVaDgylm/8G5NMKZaSBVdcFJwfU1w6Ymn49R nZkAT5PC9KgT5RTuRTZ3DO/Y2RC9vg2T9pXjEn8NGYcV8GgUkc3dZAn48S3AFgnV 4jQI5sbxvwU12XkCUn+DkETh13g3gLYtRxwehBu/S/ovED5iNHKJwnXRzxyAA969 dARNriSeyLBVMLamJ+rJB1S5hVTZTMbughFVVFbgriyIGuC/C1g9b9GN86dCGS6w T6VrKveK/iVCLUB16KV8+inbfvUrXItOxhGJWPHw9uAJGLZTz5G+yLHRRPX8onyC pgDesJDDpP/P7sAnKto3tF1Vzi7lwVtVPC1dT1Fc9VAWJGPYC/d16EKGIpNqqlnc mAIJ7wcI5c/HxvqXR2rdRV6fMer+aY7utwMsh4o/gDs7ArQUcuCrOKSW0jvHGmyr MumARXnUGDwPnGD8IfYI2vDOvwisv9g6XACwsM+pi499SfowaiLuK3utsMccagGZ INFtaqS7gpcj8TTu3kymw1TbKW/tqG9T81RRZ0rFH1q3aSfvwQ9QvsXctSeqOl/n lkxR3Mk0CT0oupXNKV6pjqsRwLroA0AF5tGxuw4ap1Rt4i8G7WHaPgRp7SPZxtkR qVBryfK5bKqYtWp4ztVK =Dgn5 -----END PGP SIGNATURE----- Merge tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix 32-bit TCE table init in kdump kernel from Nish - Fix kdump with non-power-of-2 crashkernel= from Nish - Abort cxl_pci_enable_device_hook() if PCI channel is offline from Andrew - Fix to release DRC when configure_connector() fails from Bharata - Wire up sys_userfaultfd() - Fix race condition in tearing down MSI interrupts from Paul - Fix unbalanced pci_dev_get() in cxl_probe() from Daniel - Fix cxl build failure due to -Wunused-variable gcc behaviour change from Ian - Tell the toolchain to use ABI v2 when building an LE boot wrapper from Benh - Fix THP to recompute hash value after a failed update from Aneesh - 32-bit memcpy/memset: only use dcbz once cache is enabled from Christophe * tag 'powerpc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc32: memset: only use dcbz once cache is enabled powerpc32: memcpy: only use dcbz once cache is enabled powerpc/mm: Recompute hash value after a failed update powerpc/boot: Specify ABI v2 when building an LE boot wrapper cxl: Fix build failure due to -Wunused-variable behaviour change cxl: Fix unbalanced pci_dev_get in cxl_probe powerpc/MSI: Fix race condition in tearing down MSI interrupts powerpc: Wire up sys_userfaultfd() powerpc/pseries: Release DRC when configure_connector fails cxl: abort cxl_pci_enable_device_hook() if PCI channel is offline powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel= powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel	2015-09-18 08:01:06 -07:00
Marc Zyngier	705a7b474e	powerpc/PCI: Fix lookup of linux,pci-probe-only property When find_and_init_phbs() looks for the probe-only property, it seems to trust the firmware to be correctly written, and assumes that there is a parameter to the property. It is conceivable that the firmware could not be that perfect, and it could expose this property naked (at least one arm64 platform seems to exhibit this exact behaviour). The setup code the ends up making a decision based on whatever the property pointer points to, which is likely to be junk. Instead, switch to the common of_pci.c implementation that doesn't suffer from this problem and ignore the property if the firmware couldn't make up its mind. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-17 12:21:02 -05:00
LEROY Christophe	400c47d81c	powerpc32: memset: only use dcbz once cache is enabled memset() uses instruction dcbz to speed up clearing by not wasting time loading cache line with data that will be overwritten. Some platform like mpc52xx do no have cache active at startup and can therefore not use memset(). Allthough no part of the code explicitly uses memset(), GCC may make calls to it. This patch modifies memset() such that at startup, memset() unconditionally skip the optimised bloc that uses dcbz instruction. Once the initial MMU is set up, in machine_init() we patch memset() by replacing this inconditional jump by a NOP Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-17 10:36:53 +10:00
LEROY Christophe	1cd03890ea	powerpc32: memcpy: only use dcbz once cache is enabled memcpy() uses instruction dcbz to speed up copy by not wasting time loading cache line with data that will be overwritten. Some platform like mpc52xx do no have cache active at startup and can therefore not use memcpy(). Allthough no part of the code explicitly uses memcpy(), GCC makes calls to it. This patch modifies memcpy() such that at startup, memcpy() unconditionally jumps to generic_memcpy() which doesn't use the dcbz instruction. Once the initial MMU is set up, in machine_init() we patch memcpy() by replacing this inconditional jump by a NOP Reported-by: Michal Sojka <sojkam1@fel.cvut.cz> Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-17 10:36:44 +10:00
Thomas Gleixner	bd0b9ac405	genirq: Remove irq argument from irq flow handlers Most interrupt flow handlers do not use the irq argument. Those few which use it can retrieve the irq number from the irq descriptor. Remove the argument. Search and replace was done with coccinelle and some extra helper scripts around it. Thanks to Julia for her help! Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com>	2015-09-16 15:47:51 +02:00
Thomas Gleixner	9ca86b204b	powerpc/mpc8xx: Use irq_set_handler_locked() Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org	2015-09-16 15:43:11 +02:00
Thomas Gleixner	9758a7b0e5	powerpc/ipic: Use irq_set_handler_locked() Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anton Blanchard <anton@samba.org> Cc: linuxppc-dev@lists.ozlabs.org	2015-09-16 15:43:11 +02:00
Thomas Gleixner	e9e879a3d6	powerpc/cpm2: Use irq_set_handler_locked() Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org	2015-09-16 15:43:10 +02:00
Thomas Gleixner	6b83bd9414	powerpc/mpc52xx: Use irq_set_handler_locked() Use irq_set_handler_locked() as it avoids a redundant lookup of the irq descriptor. Search and replacement was done with coccinelle: @@ struct irq_data *d; expression E1; @@ -__irq_set_handler_locked(d->irq, E1); +irq_set_handler_locked(d, E1); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Anatolij Gustschin <agust@denx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org	2015-09-16 15:43:10 +02:00
Aneesh Kumar K.V	36b35d5d80	powerpc/mm: Recompute hash value after a failed update If we had secondary hash flag set, we ended up modifying hash value in the updatepp code path. Hence with a failed updatepp we will be using a wrong hash value for the following hash insert. Fix this by recomputing hash before insert. Without this patch we can end up with using wrong slot number in linux pte. That can result in us missing an hash pte update or invalidate which can cause memory corruption or even machine check. Fixes: `6d492ecc64` ("powerpc/THP: Add code to handle HPTE faults for hugepages") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-16 22:06:03 +10:00
Benjamin Herrenschmidt	655471f54c	powerpc/boot: Specify ABI v2 when building an LE boot wrapper The kernel does it, not the boot wrapper, which breaks with some cross compilers that still default to ABI v1. Fixes: `147c05168f` ("powerpc/boot: Add support for 64bit little endian wrapper") Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-16 22:05:55 +10:00
Paolo Bonzini	62bea5bff4	KVM: add halt_attempted_poll to VCPU stats This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-16 12:17:00 +02:00
Bjorn Helgaas	237865f195	PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code" Revert `dff22d2054` ("PCI: Call pci_read_bridge_bases() from core instead of arch code"). Reading PCI bridge windows is not arch-specific in itself, but there is PCI core code that doesn't work correctly if we read them too early. For example, Hannes found this case on an ARM Freescale i.mx6 board: pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff] pci 0000:00:00.0: PCI bridge to [bus 01-ff] pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window) pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000] pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100] The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs 0x204100 of space, and mem windows are megabyte-aligned. Bus sizing can increase a bridge window size, but never decrease it (see `d65245c329` ("PCI: don't shrink bridge resources")). Prior to `dff22d2054`, ARM didn't read bridge windows at all, so the "original size" was zero, and we assigned a 3MB window. After `dff22d2054`, we read the bridge windows before sizing the bus. The firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since we never decrease the size, we kept 16MB even though we only needed 3MB. But 16MB doesn't fit in the host bridge aperture, so we failed to assign space for the window and the downstream devices. I think this is a defect in the PCI core: we shouldn't rely on the firmware to assign sensible windows. Ray reported a similar problem, also on ARM, with Broadcom iProc. Issues like this are too hard to fix right now, so revert `dff22d2054`. Reported-by: Hannes <oe5hpm@gmail.com> Reported-by: Ray Jui <rjui@broadcom.com> Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2015-09-15 13:18:04 -05:00
Jiang Liu	da92b4eb7e	powerpc, irq: Use access helper irq_data_get_affinity_mask() Use access helper irq_data_get_affinity_mask() so we can move the affinity mask to irq_common_data. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1433145945-789-25-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-09-15 17:06:28 +02:00
Thomas Gleixner	391de7f9ef	powerpc/cell: Prepare irq handler for irq argument removal The irq argument of most interrupt flow handlers is unused or merily used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linuxppc-dev@lists.ozlabs.org	2015-09-14 10:30:05 +02:00
Thomas Gleixner	0a0dbd9258	powerpc/85xx: Prepare irq handlers for irq argument removal The irq argument of most interrupt flow handlers is unused or merily used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Scott Wood <scottwood@freescale.com> Cc: linuxppc-dev@lists.ozlabs.org	2015-09-14 10:30:05 +02:00
Thomas Gleixner	5aac2d3368	powerpc/mpc5121_ads_cpld: Prepare irq handler for irq argument removal The irq argument of most interrupt flow handlers is unused or merily used instead of a local variable. The handlers which need the irq argument can retrieve the irq number from the irq descriptor. Search and update was done with coccinelle and the invaluable help of Julia Lawall. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Anatolij Gustschin <agust@denx.de> Cc: linuxppc-dev@lists.ozlabs.org	2015-09-14 10:30:04 +02:00
Sukadev Bhattiprolu	8f3e5684d3	perf/core: Drop PERF_EVENT_TXN We currently use PERF_EVENT_TXN flag to determine if we are in the middle of a transaction. If in a transaction, we defer the schedulability checks from pmu->add() operation to the pmu->commit() operation. Now that we have "transaction types" (PERF_PMU_TXN_ADD, PERF_PMU_TXN_READ) we can use the type to determine if we are in a transaction and drop the PERF_EVENT_TXN flag. When PERF_EVENT_TXN is dropped, the cpuhw->group_flag on some architectures becomes unused, so drop that field as well. This is an extension of the Powerpc patch from Peter Zijlstra to s390, Sparc and x86 architectures. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1441336073-22750-11-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-09-13 11:27:30 +02:00
Sukadev Bhattiprolu	88a486132d	powerpc, perf/powerpc/hv-24x7: Use PMU_TXN_READ interface The 24x7 counters in Powerpc allow monitoring a large number of counters simultaneously. They also allow reading several counters in a single HCALL so we can get a more consistent snapshot of the system. Use the PMU's transaction interface to monitor and read several event counters at once. The idea is that users can group several 24x7 events into a single group of events. We use the following logic to submit the group of events to the PMU and read the values: pmu->start_txn() // Initialize before first event for each event in group pmu->read(event); // Queue each event to be read pmu->commit_txn() // Read/update all queuedcounters The ->commit_txn() also updates the event counts in the respective perf_event objects. The perf subsystem can then directly get the event counts from the perf_event and can avoid submitting a new ->read() request to the PMU. Thanks to input from Peter Zijlstra. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1441336073-22750-10-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-09-13 11:27:29 +02:00
Sukadev Bhattiprolu	fbbe070115	perf/core: Add a 'flags' parameter to the PMU transactional interfaces Currently, the PMU interface allows reading only one counter at a time. But some PMUs like the 24x7 counters in Power, support reading several counters at once. To leveage this functionality, extend the transaction interface to support a "transaction type". The first type, PERF_PMU_TXN_ADD, refers to the existing transactions, i.e. used to _schedule_ all the events on the PMU as a group. A second transaction type, PERF_PMU_TXN_READ, will be used in a follow-on patch, by the 24x7 counters to read several counters at once. Extend the transaction interfaces to the PMU to accept a 'txn_flags' parameter and use this parameter to ignore any transactions that are not of type PERF_PMU_TXN_ADD. Thanks to Peter Zijlstra for his input. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [peterz: s390 compile fix] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1441336073-22750-3-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-09-13 11:27:25 +02:00
Linus Torvalds	33e247c7e5	Merge branch 'akpm' (patches from Andrew) Merge third patch-bomb from Andrew Morton: - even more of the rest of MM - lib/ updates - checkpatch updates - small changes to a few scruffy filesystems - kmod fixes/cleanups - kexec updates - a dma-mapping cleanup series from hch * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits) dma-mapping: consolidate dma_set_mask dma-mapping: consolidate dma_supported dma-mapping: cosolidate dma_mapping_error dma-mapping: consolidate dma_{alloc,free}_noncoherent dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} mm: use vma_is_anonymous() in create_huge_pmd() and wp_huge_pmd() mm: make sure all file VMAs have ->vm_ops set mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff() mm: mark most vm_operations_struct const namei: fix warning while make xmldocs caused by namei.c ipc: convert invalid scenarios to use WARN_ON zlib_deflate/deftree: remove bi_reverse() lib/decompress_unlzma: Do a NULL check for pointer lib/decompressors: use real out buf size for gunzip with kernel fs/affs: make root lookup from blkdev logical size sysctl: fix int -> unsigned long assignments in INT_MIN case kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo kexec: align crash_notes allocation to make it be inside one physical page kexec: remove unnecessary test in kimage_alloc_crash_control_pages() kexec: split kexec_load syscall from kexec core code ...	2015-09-10 18:19:42 -07:00
Linus Torvalds	519f526d39	ARM: - Full debug support for arm64 - Active state switching for timer interrupts - Lazy FP/SIMD save/restore for arm64 - Generic ARMv8 target PPC: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8 x86: - Compiler warnings Generic: - Adaptive polling for guest halt -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJV7qd/AAoJEL/70l94x66DDBcH/2OLomKHjDOGXqJ/dpkqf4UU FYI1pVjs2zP4z3L7RYV/DeuEsD6XaWzS7EXQOS3mcb9d8GWahPrdofeVmpmhg/8y jmkuUEFHl2Ut6imk8qDlG3m42c86Mk8/1k38l1bp8S3lL0/Q7IyADyYAlHdwzpOx yEyOAE4VU4n+VyQH5dbnzc12QRTeHfRQc/dI3eQq238gf37SF/1qzOzeLIdbEa+N DCzqQ8SExbctiRaLzCY5Ogan+unZBQbFfhrDrUSryywrzo/8WRFVmbjuf5O5Ucxa +UTLMvmm1YgxvBvWhlcmA+HSzSVeWNvaHQ9illgE5+74G5CzaD2ukurmoz/+r+A= =XtrL -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull more kvm updates from Paolo Bonzini: "ARM: - Full debug support for arm64 - Active state switching for timer interrupts - Lazy FP/SIMD save/restore for arm64 - Generic ARMv8 target PPC: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8 x86: - Compiler warnings Generic: - Adaptive polling for guest halt" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (49 commits) kvm: irqchip: fix memory leak kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PF KVM: trace kvm_halt_poll_ns grow/shrink KVM: dynamic halt-polling KVM: make halt_poll_ns per-vCPU Silence compiler warning in arch/x86/kvm/emulate.c kvm: compile process_smi_save_seg_64() only for x86_64 KVM: x86: avoid uninitialized variable warning KVM: PPC: Book3S: Fix typo in top comment about locking KVM: PPC: Book3S: Fix size of the PSPB register KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set KVM: PPC: Book3S HV: Fix race in starting secondary threads KVM: PPC: Book3S: correct width in XER handling KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation KVM: PPC: Book3S HV: Fix preempted vcore list locking KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD KVM: PPC: Book3S HV: Fix bug in dirty page tracking KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 KVM: PPC: Book3S HV: Make use of unused threads when running guests ...	2015-09-10 16:42:49 -07:00
Christoph Hellwig	452e06af1f	dma-mapping: consolidate dma_set_mask Almost everyone implements dma_set_mask the same way, although some time that's hidden in ->set_dma_mask methods. This patch consolidates those into a common implementation that either calls ->set_dma_mask if present or otherwise uses the default implementation. Some architectures used to only call ->set_dma_mask after the initial checks, and those instance have been fixed to do the full work. h8300 implemented dma_set_mask bogusly as a no-ops and has been fixed. Unfortunately some architectures overload unrelated semantics like changing the dma_ops into it so we still need to allow for an architecture override for now. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Christoph Hellwig	ee196371d5	dma-mapping: consolidate dma_supported Most architectures just call into ->dma_supported, but some also return 1 if the method is not present, or 0 if no dma ops are present (although that should never happeb). Consolidate this more broad version into common code. Also fix h8300 which inorrectly always returned 0, which would have been a problem if it's dma_set_mask implementation wasn't a similarly buggy noop. As a few architectures have much more elaborate implementations, we still allow for arch overrides. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Christoph Hellwig	efa21e432c	dma-mapping: cosolidate dma_mapping_error Currently there are three valid implementations of dma_mapping_error: (1) call ->mapping_error (2) check for a hardcoded error code (3) always return 0 This patch provides a common implementation that calls ->mapping_error if present, then checks for DMA_ERROR_CODE if defined or otherwise returns 0. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Christoph Hellwig	1e8937526e	dma-mapping: consolidate dma_{alloc,free}_noncoherent Most architectures do not support non-coherent allocations and either define dma_{alloc,free}_noncoherent to their coherent versions or stub them out. Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips implements them directly. This patch moves the Openrisc version to common code, and handles the DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance. Note that actual non-coherent allocations require a dma_cache_sync implementation, so if non-coherent allocations didn't work on an architecture before this patch they still won't work after it. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Christoph Hellwig	6894258eda	dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt outs for two of them as a few architectures have very non-standard implementations. This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences: - the debug_dma helpers are now called from all architectures, including those that were previously missing them - dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops dma-mapping-common.h always includes dma-coherent.h to get the defintions for them, or the stubs if the architecture doesn't support this feature - checks for ->alloc / ->free presence are removed. There is only one magic instead of dma_map_ops without them (mic_dma_ops) and that one is x86 only anyway. Besides that only x86 needs special treatment to replace a default devices if none is passed and tweak the gfp_flags. An optional arch hook is provided for that. [linux@roeck-us.net: fix build] [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Dave Young	2965faa5e0	kexec: split kexec_load syscall from kexec core code There are two kexec load syscalls, kexec_load another and kexec_file_load. kexec_file_load has been splited as kernel/kexec_file.c. In this patch I split kexec_load syscall code to kernel/kexec.c. And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and use kexec_file_load only, or vice verse. The original requirement is from Ted Ts'o, he want kexec kernel signature being checked with CONFIG_KEXEC_VERIFY_SIG enabled. But kexec-tools use kexec_load syscall can bypass the checking. Vivek Goyal proposed to create a common kconfig option so user can compile in only one syscall for loading kexec kernel. KEXEC/KEXEC_FILE selects KEXEC_CORE so that old config files still work. Because there's general code need CONFIG_KEXEC_CORE, so I updated all the architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects KEXEC_CORE in arch Kconfig. Also updated general kernel code with to kexec_load syscall. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Dave Young <dyoung@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Josh Boyer <jwboyer@fedoraproject.org> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Paul Mackerras	e297c939b7	powerpc/MSI: Fix race condition in tearing down MSI interrupts This fixes a race which can result in the same virtual IRQ number being assigned to two different MSI interrupts. The most visible consequence of that is usually a warning and stack trace from the sysfs code about an attempt to create a duplicate entry in sysfs. The race happens when one CPU (say CPU 0) is disposing of an MSI while another CPU (say CPU 1) is setting up an MSI. CPU 0 calls (for example) pnv_teardown_msi_irqs(), which calls msi_bitmap_free_hwirqs() to indicate that the MSI (i.e. its hardware IRQ number) is no longer in use. Then, before CPU 0 gets to calling irq_dispose_mapping() to free up the virtal IRQ number, CPU 1 comes in and calls msi_bitmap_alloc_hwirqs() to allocate an MSI, and gets the same hardware IRQ number that CPU 0 just freed. CPU 1 then calls irq_create_mapping() to get a virtual IRQ number, which sees that there is currently a mapping for that hardware IRQ number and returns the corresponding virtual IRQ number (which is the same virtual IRQ number that CPU 0 was using). CPU 0 then calls irq_dispose_mapping() and frees that virtual IRQ number. Now, if another CPU comes along and calls irq_create_mapping(), it is likely to get the virtual IRQ number that was just freed, resulting in the same virtual IRQ number apparently being used for two different hardware interrupts. To fix this race, we just move the call to msi_bitmap_free_hwirqs() to after the call to irq_dispose_mapping(). Since virq_to_hw() doesn't work for the virtual IRQ number after irq_dispose_mapping() has been called, we need to call it before irq_dispose_mapping() and remember the result for the msi_bitmap_free_hwirqs() call. The pattern of calling msi_bitmap_free_hwirqs() before irq_dispose_mapping() appears in 5 places under arch/powerpc, and appears to have originated in commit `05af7bd2d7` ("[POWERPC] MPIC U3/U4 MSI backend") from 2007. Fixes: `05af7bd2d7` ("[POWERPC] MPIC U3/U4 MSI backend") Cc: stable@vger.kernel.org # v2.6.22+ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-10 17:27:08 +10:00
Michael Ellerman	b855d45dc3	powerpc: Wire up sys_userfaultfd() The selftest passes on 64-bit LE and BE. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-09 12:51:15 +10:00
Linus Torvalds	f6f7a63692	Merge branch 'akpm' (patches from Andrew) Merge second patch-bomb from Andrew Morton: "Almost all of the rest of MM. There was an unusually large amount of MM material this time" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (141 commits) zpool: remove no-op module init/exit mm: zbud: constify the zbud_ops mm: zpool: constify the zpool_ops mm: swap: zswap: maybe_preload & refactoring zram: unify error reporting zsmalloc: remove null check from destroy_handle_cache() zsmalloc: do not take class lock in zs_shrinker_count() zsmalloc: use class->pages_per_zspage zsmalloc: consider ZS_ALMOST_FULL as migrate source zsmalloc: partial page ordering within a fullness_list zsmalloc: use shrinker to trigger auto-compaction zsmalloc: account the number of compacted pages zsmalloc/zram: introduce zs_pool_stats api zsmalloc: cosmetic compaction code adjustments zsmalloc: introduce zs_can_compact() function zsmalloc: always keep per-class stats zsmalloc: drop unused variable `nr_to_migrate' mm/memblock.c: fix comment in __next_mem_range() mm/page_alloc.c: fix type information of memoryless node memory-hotplug: fix comments in zone_spanned_pages_in_node() and zone_spanned_pages_in_node() ...	2015-09-08 17:52:23 -07:00
Vlastimil Babka	96db800f5d	mm: rename alloc_pages_exact_node() to __alloc_pages_node() alloc_pages_exact_node() was introduced in commit `6484eb3e2a` ("page allocator: do not check NUMA node ID when the caller knows the node is valid") as an optimized variant of alloc_pages_node(), that doesn't fallback to current node for nid == NUMA_NO_NODE. Unfortunately the name of the function can easily suggest that the allocation is restricted to the given node and fails otherwise. In truth, the node is only preferred, unless __GFP_THISNODE is passed among the gfp flags. The misleading name has lead to mistakes in the past, see for example commits `5265047ac3` ("mm, thp: really limit transparent hugepage allocation to local node") and `b360edb43f` ("mm, mempolicy: migrate_to_node should only migrate to node"). Another issue with the name is that there's a family of alloc_pages_exact*() functions where 'exact' means exact size (instead of page order), which leads to more confusion. To prevent further mistakes, this patch effectively renames alloc_pages_exact_node() to __alloc_pages_node() to better convey that it's an optimized variant of alloc_pages_node() not intended for general usage. Both functions get described in comments. It has been also considered to really provide a convenience function for allocations restricted to a node, but the major opinion seems to be that __GFP_THISNODE already provides that functionality and we shouldn't duplicate the API needlessly. The number of users would be small anyway. Existing callers of alloc_pages_exact_node() are simply converted to call __alloc_pages_node(), with the exception of sba_alloc_coherent() which open-codes the check for NUMA_NO_NODE, so it is converted to use alloc_pages_node() instead. This means it no longer performs some VM_BUG_ON checks, and since the current check for nid in alloc_pages_node() uses a 'nid < 0' comparison (which includes NUMA_NO_NODE), it may hide wrong values which would be previously exposed. Both differences will be rectified by the next patch. To sum up, this patch makes no functional changes, except temporarily hiding potentially buggy callers. Restricting the checks in alloc_pages_node() is left for the next patch which can in turn expose more existing buggy callers. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Robin Holt <robinmholt@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Mel Gorman <mgorman@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Cliff Whickman <cpw@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-08 15:35:28 -07:00
Linus Torvalds	12f03ee606	libnvdimm for 4.3: 1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic mechanism for adding device-driver-discovered memory regions to the kernel's direct map. This facility is used by the pmem driver to enable pfn_to_page() operations on the page frames returned by DAX ('direct_access' in 'struct block_device_operations'). For now, the 'memmap' allocation for these "device" pages comes from "System RAM". Support for allocating the memmap from device memory will arrive in a later kernel. 2/ Introduce memremap() to replace usages of ioremap_cache() and ioremap_wt(). memremap() drops the __iomem annotation for these mappings to memory that do not have i/o side effects. The replacement of ioremap_cache() with memremap() is limited to the pmem driver to ease merging the api change in v4.3. Completion of the conversion is targeted for v4.4. 3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem driver, update the VFS DAX implementation and PMEM api to provide persistence guarantees for kernel operations on a DAX mapping. 4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as cacheable to improve performance. 5/ Miscellaneous updates and fixes to libnvdimm including support for issuing "address range scrub" commands, clarifying the optimal 'sector size' of pmem devices, a clarification of the usage of the ACPI '_STA' (status) property for DIMM devices, and other minor fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5 JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR 1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1 o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn slsw6DkrWT60CRE42nbK =o57/ -----END PGP SIGNATURE----- Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "This update has successfully completed a 0day-kbuild run and has appeared in a linux-next release. The changes outside of the typical drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and the introduction of ZONE_DEVICE + devm_memremap_pages(). Summary: - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic mechanism for adding device-driver-discovered memory regions to the kernel's direct map. This facility is used by the pmem driver to enable pfn_to_page() operations on the page frames returned by DAX ('direct_access' in 'struct block_device_operations'). For now, the 'memmap' allocation for these "device" pages comes from "System RAM". Support for allocating the memmap from device memory will arrive in a later kernel. - Introduce memremap() to replace usages of ioremap_cache() and ioremap_wt(). memremap() drops the __iomem annotation for these mappings to memory that do not have i/o side effects. The replacement of ioremap_cache() with memremap() is limited to the pmem driver to ease merging the api change in v4.3. Completion of the conversion is targeted for v4.4. - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem driver, update the VFS DAX implementation and PMEM api to provide persistence guarantees for kernel operations on a DAX mapping. - Convert the ACPI NFIT 'BLK' driver to map the block apertures as cacheable to improve performance. - Miscellaneous updates and fixes to libnvdimm including support for issuing "address range scrub" commands, clarifying the optimal 'sector size' of pmem devices, a clarification of the usage of the ACPI '_STA' (status) property for DIMM devices, and other minor fixes" * tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits) libnvdimm, pmem: direct map legacy pmem by default libnvdimm, pmem: 'struct page' for pmem libnvdimm, pfn: 'struct page' provider infrastructure x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB add devm_memremap_pages mm: ZONE_DEVICE for "device memory" mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h dax: drop size parameter to ->direct_access() nd_blk: change aperture mapping from WC to WB nvdimm: change to use generic kvfree() pmem, dax: have direct_access use __pmem annotation dax: update I/O path to do proper PMEM flushing pmem: add copy_from_iter_pmem() and clear_pmem() pmem, x86: clean up conditional pmem includes pmem: remove layer when calling arch_has_wmb_pmem() pmem, x86: move x86 PMEM API to new pmem.h header libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option pmem: switch to devm_ allocations devres: add devm_memremap libnvdimm, btt: write and validate parent_uuid ...	2015-09-08 14:35:59 -07:00
Linus Torvalds	dab3c3cc4f	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull core kbuild updates from Michal Marek: - modpost portability fix - linker script fix - genksyms segfault fix - fixdep cleanup - fix for clang detection * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kbuild: Fix clang detection kbuild: fixdep: drop meaningless hash table initialization kbuild: fixdep: optimize code slightly genksyms: Regenerate parser genksyms: Duplicate function pointer type definitions segfault kbuild: Fix .text.unlikely placement Avoid conflict with host definitions when cross-compiling	2015-09-08 14:12:19 -07:00
Linus Torvalds	59a47fff02	Mostly this is just clean ups and micro optimizations. The changes with more meat are: o Allowing the trace event filters to filter on CPU number and process ids o Two new markers for trace output latency were added (10 and 100 msec latencies) o Have tracing_thresh filter function profiling time I also worked on modifying the ring buffer code for some future work, and moved the adding of the timestamp around. One of my changes caused a regression, and since other changes were built on top of it and already tested, I had to operate a revert of that change. Instead of rebasing, this change set has the code that caused a regression as well as the code to revert that change without touching the other changes that were made on top of it. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJV6aZEAAoJEEjnJuOKh9ldrR4H/A1RcQf1prLLoUibPP4w3lat dmQcdpS1NY+cqyiKuKPAOkFDGQL7qWzRqZ8whcPSJIsHq57ufqNSLf+0bbQYPzg9 g3CgGL7OApmGi5ulj0sNxhadvc9TFm/SAN0nVJlNuUWdm8e1UWHLsrJZaMfopu2r RDEtkOhg619mhDL4rktNdS6rk0B92Fhu2o2PwLZPVlUl1NNEt4WJU+ejitXUVO1A Nb70/rTGGJKtyHbW+74on4LnEN5Uu0Viu6rMwGfYyIgRmC2otdBDvE4xfKMiTUKr SzBjzrhIoMIRn4Vl0vElfulkpYaw7pcC2BdpZ4d9VpIOiLSlZs0x/TgCtpFEv5M= =baZ3 -----END PGP SIGNATURE----- Merge tag 'trace-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing update from Steven Rostedt: "Mostly this is just clean ups and micro optimizations. The changes with more meat are: - Allowing the trace event filters to filter on CPU number and process ids - Two new markers for trace output latency were added (10 and 100 msec latencies) - Have tracing_thresh filter function profiling time I also worked on modifying the ring buffer code for some future work, and moved the adding of the timestamp around. One of my changes caused a regression, and since other changes were built on top of it and already tested, I had to operate a revert of that change. Instead of rebasing, this change set has the code that caused a regression as well as the code to revert that change without touching the other changes that were made on top of it" * tag 'trace-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Revert "ring-buffer: Get timestamp after event is allocated" tracing: Don't make assumptions about length of string on task rename tracing: Allow triggers to filter for CPU ids and process names ftrace: Format MCOUNT_ADDR address as type unsigned long tracing: Introduce two additional marks for delay ftrace: Fix function_graph duration spacing with 7-digits ftrace: add tracing_thresh to function profile tracing: Clean up stack tracing and fix fentry updates ring-buffer: Reorganize function locations ring-buffer: Make sure event has enough room for extend and padding ring-buffer: Get timestamp after event is allocated ring-buffer: Move the adding of the extended timestamp out of line ring-buffer: Add event descriptor to simplify passing data ftrace: correct the counter increment for trace_buffer data tracing: Fix for non-continuous cpu ids tracing: Prefer kcalloc over kzalloc with multiply	2015-09-08 14:04:14 -07:00
Bharata B Rao	daebaabb5c	powerpc/pseries: Release DRC when configure_connector fails Commit `f32393c943` ("powerpc/pseries: Correct cpu affinity for dlpar added cpus") moved dlpar_acquire_drc() call to before dlpar_configure_connector() call in dlpar_cpu_probe(), but missed to release the DRC if dlpar_configure_connector() failed. During CPU hotplug, if configure-connector fails for any reason, then this will result in subsequent CPU hotplug attempts to fail. Release the acquired DRC if dlpar_configure_connector() call fails so that the DRC is left in right isolation and allocation state for the subsequent hotplug operation to succeed. Fixes: `f32393c943` ("powerpc/pseries: Correct cpu affinity for dlpar added cpus") Cc: stable@vger.kernel.org # 4.1+ Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-08 14:07:58 +10:00
Nishanth Aravamudan	fa14486979	powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel= The 32-bit TCE table initialization relies on the DMA window having a size equal to a power of 2 (and checks for it explicitly). But crashkernel= has no constraint that requires a power-of-2 be specified. This causes the kdump kernel to fail to boot as none of the PCI devices (including the disk controller) are successfully initialized. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. Fixes: `aca6913f55` ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # 4.2 Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-07 20:14:18 +10:00
Nishanth Aravamudan	bb0054552d	powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernel When attempting to kdump with the 4.2 kernel, we see for each PCI device: pci 0003:01 : [PE# 000] Assign DMA32 space pci 0003:01 : [PE# 000] Setting up 32-bit TCE table at 0..80000000 pci 0003:01 : [PE# 000] Failed to create 32-bit TCE table, err -22 PCI: Domain 0004 has 8 available 32-bit DMA segments PCI: 4 PE# for a total weight of 70 pci 0004:01 : [PE# 002] Assign DMA32 space pci 0004:01 : [PE# 002] Setting up 32-bit TCE table at 0..80000000 pci 0004:01 : [PE# 002] Failed to create 32-bit TCE table, err -22 pci 0004:0d : [PE# 005] Assign DMA32 space pci 0004:0d : [PE# 005] Setting up 32-bit TCE table at 0..80000000 pci 0004:0d : [PE# 005] Failed to create 32-bit TCE table, err -22 pci 0004:0e : [PE# 006] Assign DMA32 space pci 0004:0e : [PE# 006] Setting up 32-bit TCE table at 0..80000000 pci 0004:0e : [PE# 006] Failed to create 32-bit TCE table, err -22 pci 0004:10 : [PE# 008] Assign DMA32 space pci 0004:10 : [PE# 008] Setting up 32-bit TCE table at 0..80000000 pci 0004:10 : [PE# 008] Failed to create 32-bit TCE table, err -22 and eventually the kdump kernel fails to boot as none of the PCI devices (including the disk controller) are successfully initialized. The EINVAL response is because the DMA window (the 2GB base window) is larger than the kdump kernel's reserved memory (crashkernel=, in this case specified to be 1024M). The check in question, if ((window_size > memory_hotplug_max()) \|\| !is_power_of_2(window_size)) is a valid sanity check for pnv_pci_ioda2_table_alloc_pages(), so adjust the caller to pass in a smaller window size if our maximum memory value is smaller than the DMA window. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. The problem was seen on a Firestone machine originally. Fixes: `aca6913f55` ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Cc: stable@vger.kernel.org # 4.2 Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Coding style pedantry, use u64, change the indentation] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-09-07 20:13:14 +10:00
Michal Marek	5631d9c429	kbuild: Fix clang detection We cannot detect clang before including the arch Makefile, because that can set the default cross compiler. We also cannot detect clang after including the arch Makefile, because powerpc wants to know about clang. Solve this by using an deferred variable. This costs us a few shell invocations, but this is only a constant number. Reported-by: Behan Webster <behanw@converseincode.com> Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michal Marek <mmarek@suse.com>	2015-09-04 13:14:10 +02:00
Linus Torvalds	ff474e8ca8	powerpc updates for 4.3 - Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt - EEH fixes for SRIOV from Gavin - Introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth - Use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras - Seccomp filter support from Michael Ellerman - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar - Add powerpc timebase as a trace clock source from Naveen N. Rao - Misc cleanups in the xmon, signal & SLB code from Anshuman Khandual - Add an inline function to update POWER8 HID0 from Gautham R. Shenoy - Fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman - Drop support for 64K local store on 4K kernels from Michael Ellerman - move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan - Initialize distance lookup table from drconf path from Nikunj A Dadhania - Enable RTC class support from Vaibhav Jain - Disable automatically blocked PCI config from Gavin Shan - Add LEDs driver for PowerNV platform from Vasant Hegde - Fix endianness issues in the HVSI driver from Laurent Dufour - Kexec endian fixes from Samuel Mendoza-Jonas - Fix corrupted pdn list from Gavin Shan - Fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan - Freescale updates from Scott: Highlights include 32-bit memcpy/memset optimizations, checksum optimizations, 85xx config fragments and updates, device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor fixes. - A ton of cxl updates & fixes: - Add explicit precision specifiers from Rasmus Villemoes - use more common format specifier from Rasmus Villemoes - Destroy cxl_adapter_idr on module_exit from Johannes Thumshirn - Destroy afu->contexts_idr on release of an afu from Johannes Thumshirn - Compile with -Werror from Daniel Axtens - EEH support from Daniel Axtens - Plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain - Add alternate MMIO error handling from Ian Munsie - Allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan - Remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar - Release irqs if memory allocation fails from Vaibhav Jain - Remove racy attempt to force EEH invocation in reset from Daniel Axtens - Fix + cleanup error paths in cxl_dev_context_init from Ian Munsie - Fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie - Set up and enable PSL Timebase from Philippe Bergheaud -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV5+GzAAoJEFHr6jzI4aWA0iAP/jcd0kNaNBzLgcDKKygKdgz4 xn4EWu81vfMfZYWesb0ATrjlH0hLsRxSXoFUqUMhtJTa5kNAoCIaz/M8WBALS50h aT+i7br4WEU2j2FcaMyP3iAZx/2hl+2utODJSHPRWPkec1fUDBfEyBf++e520RWM HUQGIGZXh8yq7KMA96Pwhsvls9vOB8hS2UdU/NS8ff3J5jFvXC1/WmF2qfzJBS1V 8iHyz26Jl8+dJ+et7iC2oD5XQAjIH1oJgOyPVPBzAQttfi8RjuVzRA30TfPBAUwI lC9nlmPy6bCe4kiQYWVB1z7GegHyW/9vkeuMj/u8mZbqpaayMEMZmd2C3iNDXNHx i2NSvdln539t4qWYsV2v6lVCfa/ayDHD73Wackj5Dk394tzXnpCPhxNzc2yKEd5v h7vwYc9jBhsbfSCSogaM+gSHJ1APgCidggHJMYYNA2nN2u6V62RpsMB7zp/1+Q2v yqYdD8oYF4Dm21x/ujaNFrlizROD46WS0UqdJ3yP6HAqRYIpRXtibmpECJgt1n5h HjADEci4hQ2UQxdMdp/Q5KZnPTJebBtrZrmkW5r6cZBUaTB5TVkFaEWN44CT/Loh tMNeA3qOBN06CaQS2WL3UUUWpbZq9fSbWuUZ5lWZDb5AOyRxe5eWVYNLkiyIXozY L24l1bYdBhXahnjoS/kc =n9+X -----END PGP SIGNATURE----- Merge tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt - EEH fixes for SRIOV from Gavin - introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth - use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras - seccomp filter support from Michael Ellerman - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar - add powerpc timebase as a trace clock source from Naveen N. Rao - misc cleanups in the xmon, signal & SLB code from Anshuman Khandual - add an inline function to update POWER8 HID0 from Gautham R. Shenoy - fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman - drop support for 64K local store on 4K kernels from Michael Ellerman - move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan - initialize distance lookup table from drconf path from Nikunj A Dadhania - enable RTC class support from Vaibhav Jain - disable automatically blocked PCI config from Gavin Shan - add LEDs driver for PowerNV platform from Vasant Hegde - fix endianness issues in the HVSI driver from Laurent Dufour - kexec endian fixes from Samuel Mendoza-Jonas - fix corrupted pdn list from Gavin Shan - fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan - Freescale updates from Scott: Highlights include 32-bit memcpy/memset optimizations, checksum optimizations, 85xx config fragments and updates, device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor fixes. - a ton of cxl updates & fixes: - add explicit precision specifiers from Rasmus Villemoes - use more common format specifier from Rasmus Villemoes - destroy cxl_adapter_idr on module_exit from Johannes Thumshirn - destroy afu->contexts_idr on release of an afu from Johannes Thumshirn - compile with -Werror from Daniel Axtens - EEH support from Daniel Axtens - plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain - add alternate MMIO error handling from Ian Munsie - allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan - remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar - release irqs if memory allocation fails from Vaibhav Jain - remove racy attempt to force EEH invocation in reset from Daniel Axtens - fix + cleanup error paths in cxl_dev_context_init from Ian Munsie - fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie - set up and enable PSL Timebase from Philippe Bergheaud * tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits) cxl: Set up and enable PSL Timebase cxl: Fix force unmapping mmaps of contexts allocated through the kernel api cxl: Fix + cleanup error paths in cxl_dev_context_init powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail() powerpc/pseries: Cleanup on pci_dn_reconfig_notifier() powerpc/pseries: Fix corrupted pdn list powerpc/powernv: Enable LEDS support powerpc/iommu: Set default DMA offset in dma_dev_setup cxl: Remove racy attempt to force EEH invocation in reset cxl: Release irqs if memory allocation fails cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver powerpc/powernv: Reset HILE before kexec_sequence() powerpc/kexec: Reset secondary cpu endianness before kexec powerpc/hvsi: Fix endianness issues in the HVSI driver leds/powernv: Add driver for PowerNV platform powerpc/powernv: Create LED platform device powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states powerpc/powernv: Fix the log message when disabling VF cxl: Allow release of contexts which have been OPENED but not STARTED ...	2015-09-03 16:41:38 -07:00
Linus Torvalds	ca520cab25	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking and atomic updates from Ingo Molnar: "Main changes in this cycle are: - Extend atomic primitives with coherent logic op primitives (atomic_{or,and,xor}()) and deprecate the old partial APIs (atomic_{set,clear}_mask()) The old ops were incoherent with incompatible signatures across architectures and with incomplete support. Now every architecture supports the primitives consistently (by Peter Zijlstra) - Generic support for 'relaxed atomics': - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return() - atomic_read_acquire() - atomic_set_release() This came out of porting qwrlock code to arm64 (by Will Deacon) - Clean up the fragile static_key APIs that were causing repeat bugs, by introducing a new one: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. To be able to know the 'type' of the static key we encode it in the jump entry (by Peter Zijlstra) - Static key self-tests (by Jason Baron) - qrwlock optimizations (by Waiman Long) - small futex enhancements (by Davidlohr Bueso) - ... and misc other changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) jump_label/x86: Work around asm build bug on older/backported GCCs locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h locking/qrwlock: Make use of _{acquire\|release\|relaxed}() atomics locking/qrwlock: Implement queue_write_unlock() using smp_store_release() locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition locking, asm-generic: Add _{relaxed\|acquire\|release}() variants for 'atomic_long_t' locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication locking/atomics: Add _{acquire\|release\|relaxed}() variants of some atomic operations locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic locking/static_keys: Make verify_keys() static jump label, locking/static_keys: Update docs locking/static_keys: Provide a selftest jump_label: Provide a self-test s390/uaccess, locking/static_keys: employ static_branch_likely() x86, tsc, locking/static_keys: Employ static_branch_likely() locking/static_keys: Add selftest locking/static_keys: Add a new static_key interface locking/static_keys: Rework update logic locking/static_keys: Add static_key_{en,dis}able() helpers ...	2015-09-03 15:46:07 -07:00
Greg Kurz	4e33d1f0a1	KVM: PPC: Book3S: Fix typo in top comment about locking Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-09-04 07:28:05 +10:00
Thomas Huth	f35f3a48d6	KVM: PPC: Book3S: Fix size of the PSPB register The size of the Problem State Priority Boost Register is only 32 bits, but the kvm_vcpu_arch->pspb variable is declared as "ulong", ie. 64-bit. However, the assembler code accesses this variable with 32-bit accesses, and the KVM_REG_PPC_PSPB macro is defined with SIZE_U32, too, so that the current code is broken on big endian hosts: kvmppc_get_one_reg_hv() will only return zero for this register since it is using the wrong half of the pspb variable. Let's fix this problem by adjusting the size of the pspb field in the kvm_vcpu_arch structure. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-09-03 16:09:09 +10:00
Gautham R. Shenoy	06554d9f6c	KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set The code that handles the case when we receive a H_DOORBELL interrupt has a comment which says "Hypervisor doorbell - exit only if host IPI flag set". However, the current code does not actually check if the host IPI flag is set. This is due to a comparison instruction that got missed. As a result, the current code performs the exit to host only if some sibling thread or a sibling sub-core is exiting to the host. This implies that, an IPI sent to a sibling core in (subcores-per-core != 1) mode will be missed by the host unless the sibling core is on the exit path to the host. This patch adds the missing comparison operation which will ensure that when HOST_IPI flag is set, we unconditionally exit to the host. Fixes: `66feed61cd` Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-09-03 16:08:34 +10:00
Gautham R. Shenoy	7f23532866	KVM: PPC: Book3S HV: Fix race in starting secondary threads The current dynamic micro-threading code has a race due to which a secondary thread naps when it is supposed to be running a vcpu. As a side effect of this, on a guest exit, the primary thread in kvmppc_wait_for_nap() finds that this secondary thread hasn't cleared its vcore pointer. This results in "CPU X seems to be stuck!" warnings. The race is possible since the primary thread on exiting the guests only waits for all the secondaries to clear its vcore pointer. It subsequently expects the secondary threads to enter nap while it unsplits the core. A secondary thread which hasn't yet entered the nap will loop in kvm_no_guest until its vcore pointer and the do_nap flag are unset. Once the core has been unsplit, a new vcpu thread can grab the core and set the do_nap flag before setting the vcore pointers of the secondary. As a result, the secondary thread will now enter nap via kvm_unsplit_nap instead of running the guest vcpu. Fix this by setting the do_nap flag after setting the vcore pointer in the PACA of the secondary in kvmppc_run_core. Also, ensure that a secondary thread doesn't nap in kvm_unsplit_nap when the vcore pointer in its PACA struct is set. Fixes: `b4deba5c41` Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2015-09-03 16:07:42 +10:00
Linus Torvalds	1081230b74	Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block Pull core block updates from Jens Axboe: "This first core part of the block IO changes contains: - Cleanup of the bio IO error signaling from Christoph. We used to rely on the uptodate bit and passing around of an error, now we store the error in the bio itself. - Improvement of the above from myself, by shrinking the bio size down again to fit in two cachelines on x86-64. - Revert of the max_hw_sectors cap removal from a revision again, from Jeff Moyer. This caused performance regressions in various tests. Reinstate the limit, bump it to a more reasonable size instead. - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me. Most devices have huge trim limits, which can cause nasty latencies when deleting files. Enable the admin to configure the size down. We will look into having a more sane default instead of UINT_MAX sectors. - Improvement of the SGP gaps logic from Keith Busch. - Enable the block core to handle arbitrarily sized bios, which enables a nice simplification of bio_add_page() (which is an IO hot path). From Kent. - Improvements to the partition io stats accounting, making it faster. From Ming Lei. - Also from Ming Lei, a basic fixup for overflow of the sysfs pending file in blk-mq, as well as a fix for a blk-mq timeout race condition. - Ming Lin has been carrying Kents above mentioned patches forward for a while, and testing them. Ming also did a few fixes around that. - Sasha Levin found and fixed a use-after-free problem introduced by the bio->bi_error changes from Christoph. - Small blk cgroup cleanup from Viresh Kumar" * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits) blk: Fix bio_io_vec index when checking bvec gaps block: Replace SG_GAPS with new queue limits mask block: bump BLK_DEF_MAX_SECTORS to 2560 Revert "block: remove artifical max_hw_sectors cap" blk-mq: fix race between timeout and freeing request blk-mq: fix buffer overflow when reading sysfs file of 'pending' Documentation: update notes in biovecs about arbitrarily sized bios block: remove bio_get_nr_vecs() fs: use helper bio_add_page() instead of open coding on bi_io_vec block: kill merge_bvec_fn() completely md/raid5: get rid of bio_fits_rdev() md/raid5: split bio for chunk_aligned_read block: remove split code in blkdev_issue_{discard,write_same} btrfs: remove bio splitting and merge_bvec_fn() calls bcache: remove driver private bio splitting code block: simplify bio_add_page() block: make generic_make_request handle arbitrarily sized bios blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL) block: don't access bio->bi_error after bio_put() block: shrink struct bio down to 2 cache lines again ...	2015-09-02 13:10:25 -07:00
Linus Torvalds	ae98207309	Power management and ACPI material for v4.3-rc1 - ACPICA update to upstream revision 20150818 including method tracing extensions to allow more in-depth AML debugging in the kernel and a number of assorted fixes and cleanups (Bob Moore, Lv Zheng, Markus Elfring). - ACPI sysfs code updates and a documentation update related to AML method tracing (Lv Zheng). - ACPI EC driver fix related to serialized evaluations of _Qxx methods and ACPI tools updates allowing the EC userspace tool to be built from the kernel source (Lv Zheng). - ACPI processor driver updates preparing it for future introduction of CPPC support and ACPI PCC mailbox driver updates (Ashwin Chaugule). - ACPI interrupts enumeration fix for a regression related to the handling of IRQ attribute conflicts between MADT and the ACPI namespace (Jiang Liu). - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar). - ACPI device registration code reorganization to separate the sysfs-related code and bus type operations from the rest (Rafael J Wysocki). - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause, Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss). - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan Xinhui, Rafael J Wysocki). - cpufreq core cleanups on top of the previous changes allowing it to preseve its sysfs directories over system suspend/resume (Viresh Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior). - cpufreq fixes and cleanups related to governors (Viresh Kumar). - cpufreq updates (core and the cpufreq-dt driver) related to the turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz). - New DT bindings for Operating Performance Points (OPP), support for them in the OPP framework and in the cpufreq-dt driver plus related OPP framework fixes and cleanups (Viresh Kumar). - cpufreq powernv driver updates (Shilpasri G Bhat). - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen). - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean). - intel_pstate driver updates including Skylake-S support, support for enabling HW P-states per CPU and an additional vendor bypass list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao). - cpuidle core fixes related to the handling of coupled idle states (Xunlei Pang). - intel_idle driver updates including Skylake Client support and support for freeze-mode-specific idle states (Len Brown). - Driver core updates related to power management (Andy Shevchenko, Rafael J Wysocki). - Generic power domains framework fixes and cleanups (Jon Hunter, Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson). - Device PM QoS framework update to allow the latency tolerance setting to be exposed to user space via sysfs (Mika Westerberg). - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas). - System sleep support updates (Alan Stern, Len Brown, SungEun Kim). - rockchip-io AVS support updates (Heiko Stuebner). - PM core clocks support fixup (Colin Ian King). - Power capping RAPL driver update including support for Skylake H/S and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi). - Generic device properties framework fixes related to the handling of static (driver-provided) property sets (Andy Shevchenko). - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat, Shreyas B Prabhu). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJV5hhGAAoJEILEb/54YlRxs+EQAK51iFk48+IbpHYaZZ50Yo4m ZZc2zBcbwRcBlU9vKERrhG+jieSl8J/JJNxT8vBjKqyvNw038mCjewQh02ol0HuC R7nlDiVJkmZ50sLO4xwE/1UBZr/XqbddwCUnYzvFMkMTA0ePzFtf8BrJ1FXpT8S/ fkwSXQty6hvJDwxkfrbMSaA730wMju9lahx8D6MlmUAedWYZOJDMQKB4WKa/St5X 9uckBPHUBB2KiKlXxdbFPwKLNxHvLROq5SpDLc6cM/7XZB+QfNFy85CUjCUtYo1O 1W8k0qnztvZ6UEv27qz5dejGyAGOarMWGGNsmL9evoeGeHRpQL+dom7HcTnbAfUZ walyhYSm/zKkdy7Vl3xWUUQkMG48+PviMI6K0YhHXb3Rm5wlR/yBNZTwNIty9SX/ fKCHEa8QynWwLxgm53c3xRkiitJxMsHNK03moLD9zQMjshTyTNvpNbZoahyKQzk6 H+9M1DBRHhkkREDWSwGutukxfEMtWe2vcZcyERrFiY7l5k1j58DwDBMPqjPhRv6q P/1NlCzr0XYf83Y86J18LbDuPGDhTjjIEn6CqbtI2mmWqTg3+rF7zvS2ux+FzMnA gisv8l6GT9JiWhxKFqqL/rrVpwtyHebWLYE/RpNUW6fEzLziRNj1qyYO9dqI/GGi I3rfxlXoc/5xJWCgNB8f =fTgI -----END PGP SIGNATURE----- Merge tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "From the number of commits perspective, the biggest items are ACPICA and cpufreq changes with the latter taking the lead (over 50 commits). On the cpufreq front, there are many cleanups and minor fixes in the core and governors, driver updates etc. We also have a new cpufreq driver for Mediatek MT8173 chips. ACPICA mostly updates its debug infrastructure and adds a number of fixes and cleanups for a good measure. The Operating Performance Points (OPP) framework is updated with new DT bindings and support for them among other things. We have a few updates of the generic power domains framework and a reorganization of the ACPI device enumeration code and bus type operations. And a lot of fixes and cleanups all over. Included is one branch from the MFD tree as it contains some PM-related driver core and ACPI PM changes a few other commits are based on. Specifics: - ACPICA update to upstream revision 20150818 including method tracing extensions to allow more in-depth AML debugging in the kernel and a number of assorted fixes and cleanups (Bob Moore, Lv Zheng, Markus Elfring). - ACPI sysfs code updates and a documentation update related to AML method tracing (Lv Zheng). - ACPI EC driver fix related to serialized evaluations of _Qxx methods and ACPI tools updates allowing the EC userspace tool to be built from the kernel source (Lv Zheng). - ACPI processor driver updates preparing it for future introduction of CPPC support and ACPI PCC mailbox driver updates (Ashwin Chaugule). - ACPI interrupts enumeration fix for a regression related to the handling of IRQ attribute conflicts between MADT and the ACPI namespace (Jiang Liu). - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar). - ACPI device registration code reorganization to separate the sysfs-related code and bus type operations from the rest (Rafael J Wysocki). - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause, Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss). - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan Xinhui, Rafael J Wysocki). - cpufreq core cleanups on top of the previous changes allowing it to preseve its sysfs directories over system suspend/resume (Viresh Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior). - cpufreq fixes and cleanups related to governors (Viresh Kumar). - cpufreq updates (core and the cpufreq-dt driver) related to the turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz). - New DT bindings for Operating Performance Points (OPP), support for them in the OPP framework and in the cpufreq-dt driver plus related OPP framework fixes and cleanups (Viresh Kumar). - cpufreq powernv driver updates (Shilpasri G Bhat). - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen). - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean). - intel_pstate driver updates including Skylake-S support, support for enabling HW P-states per CPU and an additional vendor bypass list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao). - cpuidle core fixes related to the handling of coupled idle states (Xunlei Pang). - intel_idle driver updates including Skylake Client support and support for freeze-mode-specific idle states (Len Brown). - Driver core updates related to power management (Andy Shevchenko, Rafael J Wysocki). - Generic power domains framework fixes and cleanups (Jon Hunter, Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson). - Device PM QoS framework update to allow the latency tolerance setting to be exposed to user space via sysfs (Mika Westerberg). - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas). - System sleep support updates (Alan Stern, Len Brown, SungEun Kim). - rockchip-io AVS support updates (Heiko Stuebner). - PM core clocks support fixup (Colin Ian King). - Power capping RAPL driver update including support for Skylake H/S and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi). - Generic device properties framework fixes related to the handling of static (driver-provided) property sets (Andy Shevchenko). - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat, Shreyas B Prabhu)" * tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (180 commits) cpufreq: speedstep-lib: Use monotonic clock cpufreq: powernv: Increase the verbosity of OCC console messages cpufreq: sfi: use kmemdup rather than duplicating its implementation cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor() cpufreq: rename cpufreq_real_policy as cpufreq_user_policy cpufreq: remove redundant 'policy' field from user_policy cpufreq: remove redundant 'governor' field from user_policy cpufreq: update user_policy.* on success cpufreq: use memcpy() to copy policy cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event cpufreq: mediatek: Add MT8173 cpufreq driver dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings PM / Domains: Fix typo in description of genpd_dev_pm_detach() PM / Domains: Remove unusable governor dummies PM / Domains: Make pm_genpd_init() available to modules PM / domains: Align column headers and data in pm_genpd_summary output powercap / RAPL: disable the 2nd power limit properly tools: cpupower: Fix error when running cpupower monitor PM / OPP: Drop unlikely before IS_ERR(_OR_NULL) PM / OPP: Fix static checker warning (broken 64bit big endian systems) ...	2015-09-01 19:45:46 -07:00
Linus Torvalds	089b669506	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "The usual stuff from trivial tree for 4.3 (kerneldoc updates, printk() fixes, Documentation and MAINTAINERS updates)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits) MAINTAINERS: update my e-mail address mod_devicetable: add space before */ scsi: a100u2w: trivial typo in printk i2c: Fix typo in i2c-bfin-twi.c treewide: fix typos in comment blocks Doc: fix trivial typo in SubmittingPatches proportions: Spelling s/consitent/consistent/ dm: Spelling s/consitent/consistent/ aic7xxx: Fix typo in error message pcmcia: Fix typo in locking documentation scsi/arcmsr: Fix typos in error log drm/nouveau/gr: Fix typo in nv10.c [SCSI] Fix printk typos in drivers/scsi staging: comedi: Grammar s/Enable support a/Enable support for a/ Btrfs: Spelling s/consitent/consistent/ README: GTK+ is a acronym ASoC: omap: Fix typo in config option description mm: tlb.c: Fix error message ntfs: super.c: Fix error log fix typo in Documentation/SubmittingPatches ...	2015-09-01 18:46:42 -07:00
Linus Torvalds	17e6b00ac4	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This updated pull request does not contain the last few GIC related patches which were reported to cause a regression. There is a fix available, but I let it breed for a couple of days first. The irq departement provides: - new infrastructure to support non PCI based MSI interrupts - a couple of new irq chip drivers - the usual pile of fixlets and updates to irq chip drivers - preparatory changes for removal of the irq argument from interrupt flow handlers - preparatory changes to remove IRQF_VALID" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits) irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2 irqchip: Add documentation for the bcm2836 interrupt controller irqchip/bcm2835: Add support for being used as a second level controller irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ PCI: xilinx: Fix typo in function name irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance irqchip/gic: Only allow the primary GIC to set the CPU map PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove unicore32/irq: Prepare puv3_gpio_handler for irq argument removal tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal m68k/irq: Prepare irq handlers for irq argument removal C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal blackfin: Prepare irq handlers for irq argument removal arc/irq: Prepare idu_cascade_isr for irq argument removal sparc/irq: Use access helper irq_data_get_affinity_mask() sparc/irq: Use helper irq_data_get_irq_handler_data() parisc/irq: Use access helper irq_data_get_affinity_mask() mn10300/irq: Use access helper irq_data_get_affinity_mask() irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal ...	2015-09-01 14:33:35 -07:00
Linus Torvalds	5e359bf221	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Rather large, but nothing exiting: - new range check for settimeofday() to prevent that boot time becomes negative. - fix for file time rounding - a few simplifications of the hrtimer code - fix for the proc/timerlist code so the output of clock realtime timers is accurate - more y2038 work - tree wide conversion of clockevent drivers to the new callbacks" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits) hrtimer: Handle failure of tick_init_highres() gracefully hrtimer: Unconfuse switch_hrtimer_base() a bit hrtimer: Simplify get_target_base() by returning current base hrtimer: Drop return code of hrtimer_switch_to_hres() time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64() time: Introduce current_kernel_time64() time: Introduce struct itimerspec64 time: Add the common weak version of update_persistent_clock() time: Always make sure wall_to_monotonic isn't positive time: Fix nanosecond file time rounding in timespec_trunc() timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers cris/time: Migrate to new 'set-state' interface kernel: broadcast-hrtimer: Migrate to new 'set-state' interface xtensa/time: Migrate to new 'set-state' interface unicore/time: Migrate to new 'set-state' interface um/time: Migrate to new 'set-state' interface sparc/time: Migrate to new 'set-state' interface sh/localtimer: Migrate to new 'set-state' interface score/time: Migrate to new 'set-state' interface s390/time: Migrate to new 'set-state' interface ...	2015-09-01 14:04:50 -07:00
Linus Torvalds	25525bea46	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The dominant change in this cycle was the continued work to isolate kernel drivers from MTRR legacies: this tree gets rid of all kernel internal driver interfaces to MTRRs (mostly by rewriting it to proper PAT interfaces), the only access left is the /proc/mtrr ABI. This work was done by Luis R Rodriguez. There's also some related PCI interface additions for which I've Cc:-ed Bjorn" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del() s390/io: Add pci_iomap_wc() and pci_iomap_wc_range() drivers/dma/iop-adma: Use dma_alloc_writecombine() kernel-style drivers/video/fbdev/vt8623fb: Use arch_phys_wc_add() and pci_iomap_wc() drivers/video/fbdev/s3fb: Use arch_phys_wc_add() and pci_iomap_wc() drivers/video/fbdev/arkfb.c: Use arch_phys_wc_add() and pci_iomap_wc() PCI: Add pci_iomap_wc() variants drivers/video/fbdev/gxt4500: Use pci_ioremap_wc_bar() to map framebuffer drivers/video/fbdev/kyrofb: Use arch_phys_wc_add() and pci_ioremap_wc_bar() drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar() PCI: Add pci_ioremap_wc_bar() x86/mm: Make kernel/check.c explicitly non-modular x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular x86/mm/pat: Add comments to cachemode translation tables arch/*/io.h: Add ioremap_uc() to all architectures drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc() drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC drivers/video/fbdev/atyfb: Clarify ioremap() base and length used drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default ...	2015-09-01 10:07:40 -07:00
Rafael J. Wysocki	4ffe18c255	Merge branch 'pm-cpufreq' * pm-cpufreq: (53 commits) cpufreq: speedstep-lib: Use monotonic clock cpufreq: powernv: Increase the verbosity of OCC console messages cpufreq: sfi: use kmemdup rather than duplicating its implementation cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor() cpufreq: rename cpufreq_real_policy as cpufreq_user_policy cpufreq: remove redundant 'policy' field from user_policy cpufreq: remove redundant 'governor' field from user_policy cpufreq: update user_policy.* on success cpufreq: use memcpy() to copy policy cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event cpufreq: mediatek: Add MT8173 cpufreq driver dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings intel_pstate: append more Oracle OEM table id to vendor bypass list intel_pstate: Add SKY-S support intel_pstate: Fix possible overflow complained by Coverity cpufreq: Correct a freq check in cpufreq_set_policy() cpufreq: Lock CPU online/offline in cpufreq_register_driver() cpufreq: Replace recover_policy with new_policy in cpufreq_online() cpufreq: Separate CPU device registration from CPU online cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling ...	2015-09-01 15:52:35 +02:00
Linus Torvalds	a1d8561172	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The biggest change in this cycle is the rewrite of the main SMP load balancing metric: the CPU load/utilization. The main goal was to make the metric more precise and more representative - see the changelog of this commit for the gory details: `9d89c257df` ("sched/fair: Rewrite runnable load and utilization average tracking") It is done in a way that significantly reduces complexity of the code: 5 files changed, 249 insertions(+), 494 deletions(-) and the performance testing results are encouraging. Nevertheless we need to keep an eye on potential regressions, since this potentially affects every SMP workload in existence. This work comes from Yuyang Du. Other changes: - SCHED_DL updates. (Andrea Parri) - Simplify architecture callbacks by removing finish_arch_switch(). (Peter Zijlstra et al) - cputime accounting: guarantee stime + utime == rtime. (Peter Zijlstra) - optimize idle CPU wakeups some more - inspired by Facebook server loads. (Mike Galbraith) - stop_machine fixes and updates. (Oleg Nesterov) - Introduce the 'trace_sched_waking' tracepoint. (Peter Zijlstra) - sched/numa tweaks. (Srikar Dronamraju) - misc fixes and small cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) sched/deadline: Fix comment in enqueue_task_dl() sched/deadline: Fix comment in push_dl_tasks() sched: Change the sched_class::set_cpus_allowed() calling context sched: Make sched_class::set_cpus_allowed() unconditional sched: Fix a race between __kthread_bind() and sched_setaffinity() sched: Ensure a task has a non-normalized vruntime when returning back to CFS sched/numa: Fix NUMA_DIRECT topology identification tile: Reorganize _switch_to() sched, sparc32: Update scheduler comments in copy_thread() sched: Remove finish_arch_switch() sched, tile: Remove finish_arch_switch sched, sh: Fold finish_arch_switch() into switch_to() sched, score: Remove finish_arch_switch() sched, avr32: Remove finish_arch_switch() sched, MIPS: Get rid of finish_arch_switch() sched, arm: Remove finish_arch_switch() sched/fair: Clean up load average references sched/fair: Provide runnable_load_avg back to cfs_rq sched/fair: Remove task and group entity load when they are dead sched/fair: Init cfs_rq's sched_entity load average ...	2015-08-31 20:26:22 -07:00
Linus Torvalds	7073bc6612	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU changes in this cycle are: - the combination of tree geometry-initialization simplifications and OS-jitter-reduction changes to expedited grace periods. These two are stacked due to the large number of conflicts that would otherwise result. - privatize smp_mb__after_unlock_lock(). This commit moves the definition of smp_mb__after_unlock_lock() to kernel/rcu/tree.h, in recognition of the fact that RCU is the only thing using this, that nothing else is likely to use it, and that it is likely to go away completely. - documentation updates. - torture-test updates. - misc fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) rcu,locking: Privatize smp_mb__after_unlock_lock() rcu: Silence lockdep false positive for expedited grace periods rcu: Don't disable CPU hotplug during OOM notifiers scripts: Make checkpatch.pl warn on expedited RCU grace periods rcu: Update MAINTAINERS entry rcu: Clarify CONFIG_RCU_EQS_DEBUG help text rcu: Fix backwards RCU_LOCKDEP_WARN() in synchronize_rcu_tasks() rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN() rcu: Make rcu_is_watching() really notrace cpu: Wait for RCU grace periods concurrently rcu: Create a synchronize_rcu_mult() rcu: Fix obsolete priority-boosting comment rcu: Use WRITE_ONCE in RCU_INIT_POINTER rcu: Hide RCU_NOCB_CPU behind RCU_EXPERT rcu: Add RCU-sched flavors of get-state and cond-sync rcu: Add fastpath bypassing funnel locking rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQS rcu: Pull out wait_event*() condition into helper function documentation: Describe new expedited stall warnings rcu: Add stall warnings to synchronize_sched_expedited() ...	2015-08-31 18:12:07 -07:00
Linus Torvalds	d4c90396ed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "Here is the crypto update for 4.3: API: - the AEAD interface transition is now complete. - add top-level skcipher interface. Drivers: - x86-64 acceleration for chacha20/poly1305. - add sunxi-ss Allwinner Security System crypto accelerator. - add RSA algorithm to qat driver. - add SRIOV support to qat driver. - add LS1021A support to caam. - add i.MX6 support to caam" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (163 commits) crypto: algif_aead - fix for multiple operations on AF_ALG sockets crypto: qat - enable legacy VFs MPI: Fix mpi_read_buffer crypto: qat - silence a static checker warning crypto: vmx - Fixing opcode issue crypto: caam - Use the preferred style for memory allocations crypto: caam - Propagate the real error code in caam_probe crypto: caam - Fix the error handling in caam_probe crypto: caam - fix writing to JQCR_MS when using service interface crypto: hash - Add AHASH_REQUEST_ON_STACK crypto: testmgr - Use new skcipher interface crypto: skcipher - Add top-level skcipher interface crypto: cmac - allow usage in FIPS mode crypto: sahara - Use dmam_alloc_coherent crypto: caam - Add support for LS1021A crypto: qat - Don't move data inside output buffer crypto: vmx - Fixing GHASH Key issue on little endian crypto: vmx - Fixing AES-CTR counter bug crypto: null - Add missing Kconfig tristate for NULL2 crypto: nx - Add forward declaration for struct crypto_aead ...	2015-08-31 17:38:39 -07:00
Linus Torvalds	f36fc04e4c	The clk framework changes for 4.3 are mostly updates to existing drivers and the addition of new clock drivers. Stephen Boyd has also done a lot of subsystem-wide driver clean-ups (thanks!). There are also fixes to the framework core and changes to better split clock provider drivers from clock consumer drivers. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV5KelAAoJEKI6nJvDJaTUwaQP/RVb70v6XSgMIePuOq3iaECT bclCAyito3YFwykrPPmQ1DucHvEjlWopeFwKqEE9VjNl07TVIH/OMGeonb9yErIY aN+FMoA9RUGVexMhy004q5sSbOEihAqTgKWaOiYoY8zAfJfeTpYXUoy34FcrW7MB j/cDDJgigtWe9zzcdrW04oT454lXQaSQuGX39tDCR0s0S3soYU2JyjkyBGiO5Yid 1yIMq/nzI8SrCwxwD/nFwQNtg7lqiAN291Nbi4At1vvG5r4RhNveuLGv8uJ50XRB xwy0sdHLIVJrIJ8OUcs1sY8wxu7ghDS8u+vjTNO2RzBf3KZWbuXWX+yVM7JQi4Ty 0iL5hGbvERy5E9QSzzH+Ox2jVt5e/r/dyvRf3oBDPVrFXhKusYhn6JmdUVJkTZ83 GTw2sQdEpcmry4z/50/MaqpZuXVZ09VTOCTqp8ToseJjsz9jXxVhQ4HdAwLc8cmV txWGRXuBxCB+2o8M0oky3IKS69VFFH5u6QQ0KG8+JYOrDDG7GcnJsFeV7mQjlu8g 3evYUILNAUfJGBpkOeLs654KUBHwUyXc87cUIKwjGaPruWb2048+kdCVrL3IFwPb sS/7Qn3DQ90pHFUTssDnWLz3X0IWT3H0iV4zZyAqqdARugEo+mpykmXmMWcWc3VR MrD1l3GVxLegEf242Zpo =QAiQ -----END PGP SIGNATURE----- Merge tag 'clk-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Michael Turquette: "The clk framework changes for 4.3 are mostly updates to existing drivers and the addition of new clock drivers. Stephen Boyd has also done a lot of subsystem-wide driver clean-ups (thanks!). There are also fixes to the framework core and changes to better split clock provider drivers from clock consumer drivers" * tag 'clk-for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (227 commits) clk: s5pv210: add missing call to samsung_clk_of_add_provider() clk: pistachio: correct critical clock list clk: pistachio: Fix PLL rate calculation in integer mode clk: pistachio: Fix override of clk-pll settings from boot loader clk: pistachio: Fix 32bit integer overflows clk: tegra: Fix some static checker problems clk: qcom: Fix MSM8916 prng clock enable bit clk: Add missing header for 'bool' definition to clk-conf.h drivers/clk: appropriate __init annotation for const data clk: rockchip: register pll mux before pll itself clk: add bindings for the Ux500 clocks clk/ARM: move Ux500 PRCC bases to the device tree clk: remove duplicated code with __clk_set_parent_after clk: Convert __clk_get_name(hw->clk) to clk_hw_get_name(hw) clk: Constify clk_hw argument to provider APIs clk: Hi6220: add stub clock driver dt-bindings: clk: Hi6220: Document stub clock driver dt-bindings: arm: Hi6220: add doc for SRAM controller clk: atlas7: fix pll missed divide NR in fraction mode clk: atlas7: fix bit field and its root clk for coresight_tpiu ...	2015-08-31 17:26:48 -07:00
Linus Torvalds	26f8b7edc9	PCI changes for the v4.3 merge window: Enumeration Allocate ATS struct during enumeration (Bjorn Helgaas) Embed ATS info directly into struct pci_dev (Bjorn Helgaas) Reduce size of ATS structure elements (Bjorn Helgaas) Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas) iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas) Move MPS configuration check to pci_configure_device() (Bjorn Helgaas) Set MPS to match upstream bridge (Keith Busch) ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri) Add pci_scan_root_bus_msi() (Lorenzo Pieralisi) ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi) Resource management Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi) PCI device hotplug pciehp: Remove unused interrupt events (Bjorn Helgaas) pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas) pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson) pciehp: Simplify pcie_poll_cmd() (Yijing Wang) Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang) Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang) Hold pci_slot_mutex while searching bus->slots list (Yijing Wang) Power management Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui) Virtualization Add ACS quirks for Intel I219-LM/V (Alex Williamson) Restore ACS configuration as part of pci_restore_state() (Alexander Duyck) MSI Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu) Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu) ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi) Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi) Generic host bridge driver Remove dependency on ARM-specific struct hw_pci (Jayachandran C) Build setup-irq.o for arm64 (Jayachandran C) Add arm64 support (Jayachandran C) APM X-Gene host bridge driver Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang) Add support for a 64-bit prefetchable memory window (Duc Dang) Drop owner assignment from platform_driver (Krzysztof Kozlowski) Broadcom iProc host bridge driver Allow BCMA bus driver to be built as module (Hauke Mehrtens) Delete unnecessary checks before phy calls (Markus Elfring) Add arm64 support (Ray Jui) Synopsys DesignWare host bridge driver Don't complain missing config reg space if va_cfg0 is set (Murali Karicheri) TI DRA7xx host bridge driver Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I) Add PM support (Kishon Vijay Abraham I) Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I) Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I) Xilinx AXI host bridge driver Check for MSI interrupt flag before handling as INTx (Russell Joyce) Miscellaneous Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa) Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas) Fix generic NCR 53c810 class code quirk (Bjorn Helgaas) Fix TI816X class code quirk (Bjorn Helgaas) Remove unused "pci_probe" flags (Bjorn Helgaas) Host bridge driver code simplifications (Fabio Estevam) Add dev_flags bit to access VPD through function 0 (Mark Rustad) Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad) Kill off set_irq_flags() usage (Rob Herring) Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar) Clean up pci_find_capability() (Wei Yang) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV5FE/AAoJEFmIoMA60/r8I2QP/R9b9MrvH2i9tN98/lTDl7g3 czE58ZM1d4kMYtW3Pm/DrYI6y6RprAaB4ZEp5rHxlFLqBPZEQwWodA19NkjECcb6 g5qKWOdIWA4T6Jaab6a/yCmAFa0jni7iAmmTYqca9o3Xj7tFovxDxqPSYkh+rer0 v+1sAr/4HXSiN339KR6teEF3VZqLFp6ewMydQlVS+R7kAOHHYQDqoo9WF6JnIoL5 PO3Kbmr1WN3fZY3s98yLq1x6XmLrLlmGdJI+2r+KewO4r/05CL6wTVP/oTMi+Eti dueseeISlOTcTAUhk87Vap23uJPeB/rJbYoFdCr7+0AkZGe/U/E2dpZm2wyMcCvq OrATuFymgzIuJm5uUPsdH4lzsX97U9BcDccracfC38rYnP5u3bqHCjw8HJzANR7p VYbFBzc5ZCCUYtQAjyrKt2820AvTFo+Bu+z75IsJO8LQQgv/zGtQQ8grIQeAjH+l sAe3xOTwzZnq6Obl4qb/GElHmIGUbQ1X4Dx1mliiijKMKkhYHOA0iFnB/OBILmEZ wHzKU8chWcI9lip0aaX8q9i/qovdVUt2+rdo/N40l7YY66x4jkNgQQXZX+FSKk6H stTvEBQgK28EKCHDxMsgzTGIqllSyk4DnRMA7ij1hRWqdUbGk7wOPTvm9QSwNDWe SokuWzAQD9YeMRGdsYjZ =DX1r -----END PGP SIGNATURE----- Merge tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v4.3 merge window: Enumeration: - Allocate ATS struct during enumeration (Bjorn Helgaas) - Embed ATS info directly into struct pci_dev (Bjorn Helgaas) - Reduce size of ATS structure elements (Bjorn Helgaas) - Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas) - iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas) - Move MPS configuration check to pci_configure_device() (Bjorn Helgaas) - Set MPS to match upstream bridge (Keith Busch) - ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri) - Add pci_scan_root_bus_msi() (Lorenzo Pieralisi) - ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi) Resource management: - Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi) PCI device hotplug: - pciehp: Remove unused interrupt events (Bjorn Helgaas) - pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas) - pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson) - pciehp: Simplify pcie_poll_cmd() (Yijing Wang) - Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang) - Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang) - Hold pci_slot_mutex while searching bus->slots list (Yijing Wang) Power management: - Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui) Virtualization: - Add ACS quirks for Intel I219-LM/V (Alex Williamson) - Restore ACS configuration as part of pci_restore_state() (Alexander Duyck) MSI: - Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) - x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) - Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu) - Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu) - ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi) - Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi) Generic host bridge driver: - Remove dependency on ARM-specific struct hw_pci (Jayachandran C) - Build setup-irq.o for arm64 (Jayachandran C) - Add arm64 support (Jayachandran C) APM X-Gene host bridge driver: - Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang) - Add support for a 64-bit prefetchable memory window (Duc Dang) - Drop owner assignment from platform_driver (Krzysztof Kozlowski) Broadcom iProc host bridge driver: - Allow BCMA bus driver to be built as module (Hauke Mehrtens) - Delete unnecessary checks before phy calls (Markus Elfring) - Add arm64 support (Ray Jui) Synopsys DesignWare host bridge driver: - Don't complain missing config reg space if va_cfg0 is set (Murali Karicheri) TI DRA7xx host bridge driver: - Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I) - Add PM support (Kishon Vijay Abraham I) - Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I) - Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I) Xilinx AXI host bridge driver: - Check for MSI interrupt flag before handling as INTx (Russell Joyce) Miscellaneous: - Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa) - Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas) - Fix generic NCR 53c810 class code quirk (Bjorn Helgaas) - Fix TI816X class code quirk (Bjorn Helgaas) - Remove unused "pci_probe" flags (Bjorn Helgaas) - Host bridge driver code simplifications (Fabio Estevam) - Add dev_flags bit to access VPD through function 0 (Mark Rustad) - Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad) - Kill off set_irq_flags() usage (Rob Herring) - Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar) - Clean up pci_find_capability() (Wei Yang)" * tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (72 commits) PCI: Disable async suspend/resume for JMicron multi-function SATA/AHCI PCI: Set MPS to match upstream bridge PCI: Move MPS configuration check to pci_configure_device() PCI: Drop references acquired by of_parse_phandle() PCI/MSI: Remove unused pcibios_msi_controller() hook ARM/PCI: Remove msi_controller from struct pci_sys_data ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() PCI: Add pci_scan_root_bus_msi() ARM/PCI: Replace panic with WARN messages on failures PCI: generic: Add arm64 support PCI: Build setup-irq.o for arm64 PCI: generic: Remove dependency on ARM-specific struct hw_pci PCI: imx6: Simplify a trivial if-return sequence PCI: spear: Use BUG_ON() instead of condition followed by BUG() PCI: dra7xx: Remove unneeded use of IS_ERR_VALUE() PCI: Remove pci_ats_enabled() PCI: Stop caching ATS Invalidate Queue Depth PCI: Move ATS declarations to linux/pci.h so they're all together PCI: Clean up ATS error handling PCI: Use pci_physfn() rather than looking up physfn by hand ...	2015-08-31 17:14:39 -07:00
Linus Torvalds	e5aeced6bc	spi: Updates for v4.3 A few core tweaks this time together with the usual collection of driver specific updates and fixes plus a larger than average selection of new device support: - Fix DMA mapping of unaligned vmalloc() buffers. - Statistics tracking transfer volumes exposed via sysfs. - New drivers for Freescale MPC5125, Intel Sunrise Point, Mediatek SoCs, and Netlogic XLP SoCs. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJV5F3EAAoJECTWi3JdVIfQ45kH/isvhUPuvaZS8zQggQ95xIjy 7uZMQzf2JHwmeFiYsqFuWdf6IVgGmg68cQ4ZIzyXk1quEJkOLNCrlOvn4Qv7z6lJ fqr5qbU+BWrtv0IjN5szIx7BR+ZWQ9LHdZI651A1XbCPvU8H1PZL+PfWDK6pMHCM tqCLUVgnwRYNT0ELmgl7Up8SWKMbs98Wit2dQzNvwrfEZ1GHTMk3ChYAABfMkKIU wbuug6+cv5+VdhV73nX29BZXJj7+EyRATQknADQT3nugcXkMysi0IvLZIrSFEJTp iBApQsO3EaOVUR5/hN6zHpS8lqZdEkQvlKRuodnW31dXLrIw7TyVDT1wWknRZeU= =Jykk -----END PGP SIGNATURE----- Merge tag 'spi-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "A few core tweaks this time together with the usual collection of driver specific updates and fixes plus a larger than average selection of new device support: - fix DMA mapping of unaligned vmalloc() buffers - statistics tracking transfer volumes exposed via sysfs - new drivers for Freescale MPC5125, Intel Sunrise Point, Mediatek SoCs, and Netlogic XLP SoCs" * tag 'spi-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (66 commits) spi: sh-msiof: Fix FIFO size to 64 word from 256 word spi: fsl-(e)spi: Fix checking return value of devm_ioremap_resource spi: Add DT bindings documentation for Netlogic XLP SPI controller spi/xlp: SPI controller driver for Netlogic XLP SoCs spi: fsl-espi: add runtime PM spi: fsl-(e)spi: simplify cleanup code spi: fsl-(e)spi: migrate to using devm_ functions to simplify cleanup spi: mediatek: fix SPI_CMD_PAUSE_IE macro error spi: check bits_per_word in spi_setup spi: mediatek: replace _time name spi: mediatek: add PM clk_prepare_enable fail flow spi: mediatek: replace int with u32, delete TAB and define MTK_SPI_PAUSE_INT_STATUS marco spi: mediatek: add linux/io.h include file spi/bcm63xx-hsspi: add support for dual spi read/write spi: dw: Allow interface drivers to limit data I/O to word sizes dt: snps,dw-apb-ssi: Document new I/O data register width property spi: Fall back to master maximum speed if no slave speed specified spi: mediatek: use BIT() to instead of SPI_CMD__OFFSET spi: medaitek: revise quirks compatibility style spi: mediatek: fix spi incorrect endian usage ...	2015-08-31 15:55:49 -07:00
Mark Brown	18c558ec74	Merge remote-tracking branches 'spi/topic/dw', 'spi/topic/fsl-espi', 'spi/topic/img-spfi' and 'spi/topic/mpc512x-psc' into spi-next	2015-08-31 14:45:32 +01:00

... 5 6 7 8 9 ...

14696 Commits