linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Laurentiu TUDOR	4e21b94c9c	powerpc/85xx: Move ePAPR paravirt initialization earlier At console init, when the kernel tries to flush the log buffer the ePAPR byte-channel based console write fails silently, losing the buffered messages. This happens because The ePAPR para-virtualization init isn't done early enough so that the hcall instruction to be set, causing the byte-channel write hcall to be a nop. To fix, change the ePAPR para-virt init to use early device tree functions and move it in early init. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2013-08-07 18:38:06 -05:00
Lijun Pan	5815c434fd	powerpc/perf: add 2 additional performance monitor counters for e6500 core There are 6 counters in e6500 core instead of 4 in e500 core. Signed-off-by: Lijun Pan <Lijun.Pan@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2013-08-07 18:38:03 -05:00
Michael Ellerman	e8e813ed26	powerpc: Rename PMU interrupts from CNT to PMI Back in commit `89713ed` "Add timer, performance monitor and machine check counts to /proc/interrupts" we added a count of PMU interrupts to the output of /proc/interrupts. At the time we named them "CNT" to match x86. However in commit `89ccf46` "Rename 'performance counter interrupt'", the x86 guys renamed theirs from "CNT" to "PMI". Arguably changing the name could break someone's script, but I think the chance of that is minimal, and it's preferable to have a name that 1) is somewhat meaningful, and 2) matches x86. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-08-01 13:11:46 +10:00
Dongsheng Wang	e00c9a0c2e	powerpc/mpc85xx: invalidate TLB after hibernation resume This problem belongs to the core synchronization issues. The cpu1 already updated spin_table values, but bootcore cannot get this value in time. After bootcpu hibiernation restore the pages. we are now running with the kernel data of the old kernel fully restored. if we reset the non-bootcpus that will be reset cache(tlb), the non-bootcpus will get new address(map virtual and physical address spaces). but bootcpu tlb cache still use boot kernel data, so we need to invalidate the bootcpu tlb cache make it to get new main memory data. log: Enabling non-boot CPUs ... smp_85xx_kick_cpu: timeout waiting for core 1 to reset smp: failed starting cpu 1 (rc -2) Error taking CPU1 up: -2 Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com> Reviewed-by: Anton Vorontsov <anton@enomsg.org> [scottwood@freescale.com: reworded code comment for clarity] Signed-off-by: Scott Wood <scottwood@freescale.com>	2013-07-30 15:50:08 -05:00
Hongtao Jia	4e0e3435b5	powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx A PCIe erratum of mpc85xx may causes a core hang when a link of PCIe goes down. when the link goes down, Non-posted transactions issued via the ATMU requiring completion result in an instruction stall. At the same time a machine-check exception is generated to the core to allow further processing by the handler. We implements the handler which skips the instruction caused the stall. This patch depends on patch: powerpc/85xx: Add platform_device declaration to fsl_pci.h Signed-off-by: Zhao Chenhui <b35336@freescale.com> Signed-off-by: Li Yang <leoli@freescale.com> Signed-off-by: Liu Shuo <soniccat.liu@gmail.com> Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2013-07-30 15:50:07 -05:00
Gavin Shan	ab55d2187d	powerpc/eeh: Introdce flag to protect sysfs The patch introduces flag EEH_DEV_SYSFS to keep track that the sysfs entries for the corresponding EEH device (then PCI device) has been added or removed, in order to avoid race condition. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:49 +10:00
Gavin Shan	91150af3ad	powerpc/eeh: Fix unbalanced enable for IRQ The patch fixes following issue: Unbalanced enable for IRQ 23 ------------[ cut here ]------------ WARNING: at kernel/irq/manage.c:437 : NIP [c00000000016de8c] .__enable_irq+0x11c/0x140 LR [c00000000016de88] .__enable_irq+0x118/0x140 Call Trace: [c000003ea1f23880] [c00000000016de88] .__enable_irq+0x118/0x140 (unreliable) [c000003ea1f23910] [c00000000016df08] .enable_irq+0x58/0xa0 [c000003ea1f239a0] [c0000000000388b4] .eeh_enable_irq+0xc4/0xe0 [c000003ea1f23a30] [c000000000038a28] .eeh_report_reset+0x78/0x130 [c000003ea1f23ac0] [c000000000037508] .eeh_pe_dev_traverse+0x98/0x170 [c000003ea1f23b60] [c0000000000391ac] .eeh_handle_normal_event+0x2fc/0x3d0 [c000003ea1f23bf0] [c000000000039538] .eeh_handle_event+0x2b8/0x2c0 [c000003ea1f23c90] [c000000000039600] .eeh_event_handler+0xc0/0x170 [c000003ea1f23d30] [c0000000000da9a0] .kthread+0xf0/0x100 [c000003ea1f23e30] [c00000000000a1dc] .ret_from_kernel_thread+0x5c/0x80 Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:49 +10:00
Gavin Shan	4b83bd452f	powerpc/eeh: Don't use pci_dev during BAR restore While restoring BARs for one specific PCI device, the pci_dev instance should have been released. So it's not reliable to use the pci_dev instance on restoring BARs. However, we still need some information (e.g. PCIe capability position, header type) from the pci_dev instance. So we have to store those information to EEH device in advance. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:49 +10:00
Gavin Shan	f5c57710dd	powerpc/eeh: Use partial hotplug for EEH unaware drivers When EEH error happens to one specific PE, some devices with drivers supporting EEH won't except hotplug on the device. However, there might have other deivces without driver, or with driver without EEH support. For the case, we need do partial hotplug in order to make sure that the PE becomes absolutely quite during reset. Otherise, the PE reset might fail and leads to failure of error recovery. The current code doesn't handle that 'mixed' case properly, it either uses the error callbacks to the drivers, or tries hotplug, but doesn't handle a PE (EEH domain) composed of a combination of the two. The patch intends to support so-called "partial" hotplug for EEH: Before we do reset, we stop and remove those PCI devices without EEH sensitive driver. The corresponding EEH devices are not detached from its PE, but with special flag. After the reset is done, those EEH devices with the special flag will be scanned one by one. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:48 +10:00
Gavin Shan	ab444ec97e	powerpc/pci: Partial tree hotplug support When EEH error happens to one specific PE, the device drivers of its attached EEH devices (PCI devices) are checked to see the further action: reset with complete hotplug, or reset without hotplug. However, that's not enough for those PCI devices whose drivers can't support EEH, or those PCI devices without driver. So we need do so-called "partial hotplug" on basis of PCI devices. In the situation, part of PCI devices of the specific PE are unplugged and plugged again after PE reset. The patch changes pcibios_add_pci_devices() so that it can support full hotplug and so-called "partial" hotplug based on device-tree or real hardware. It's notable that pci_of_scan.c has been changed for a bit in order to support the "partial" hotplug based on dev-tree. Most of the generic code already supports that, we just need to plumb it properly on our side. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:48 +10:00
Gavin Shan	9feed42e93	powerpc/eeh: Use safe list traversal when walking EEH devices Currently, we're trasversing the EEH devices list using list_for_each_entry(). That's not safe enough because the EEH devices might be removed from its parent PE while doing iteration. The patch replaces that with list_for_each_entry_safe(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:47 +10:00
Gavin Shan	807a827d4e	powerpc/eeh: Keep PE during hotplug When we do normal hotplug, the PE (shadow EEH structure) shouldn't be kept around. However, we need to keep it if the hotplug an artifial one caused by EEH errors recovery. Since we remove EEH device through the PCI hook pcibios_release_device(), the flag "purge_pe" passed to various functions is meaningless. So the patch removes the meaningless flag and introduce new flag "EEH_PE_KEEP" to save the PE while doing hotplug during EEH error recovery. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:47 +10:00
Gavin Shan	008a4938ea	powerpc/pci: Override pcibios_release_device() The patch overrides pcibios_release_device() to release EEH resources (EEH cache, unbinding EEH device) for the indicated PCI device. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:46 +10:00
Gavin Shan	f2856491d2	powerpc/eeh: Export functions for hotplug Make some functions public in order to support hotplug on either specific PCI bus or PCI device in future. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:46 +10:00
Gavin Shan	0ba178888b	powerpc/eeh: Remove reference to PCI device We will rely on pcibios_release_device() to remove the EEH cache and unbind EEH device for the specific PCI device. So we shouldn't hold the reference to the PCI device from EEH cache and EEH device. Otherwise, pcibios_release_device() won't be called as we expected. The patch removes the reference to the PCI device in EEH core. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:46 +10:00
Anton Blanchard	0e0ed6406e	powerpc/modules: Module CRC relocation fix causes perf issues Module CRCs are implemented as absolute symbols that get resolved by a linker script. We build an intermediate .o that contains an unresolved symbol for each CRC. genksysms parses this .o, calculates the CRCs and writes a linker script that "resolves" the symbols to the calculated CRC. Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols that need relocating and relocates them at boot. Commit `d4703aef` (module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y) added a hook to reverse the bogus relocations. Part of this patch created a symbol at 0x0: # head -2 /proc/kallsyms 0000000000000000 T reloc_start c000000000000000 T .__start This reloc_start symbol is causing lots of confusion to perf. It thinks reloc_start is a massive function that stretches from 0x0 to 0xc000000000000000 and we get various cryptic errors out of perf, including: problem incrementing symbol count, skipping event This patch removes the reloc_start linker script label and instead defines it as PHYSICAL_START. We also need to wrap it with CONFIG_PPC64 because the ppc32 kernel can set a non zero PHYSICAL_START at compile time and we wouldn't want to subtract it from the CRCs in that case. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:43 +10:00
Michael Neuling	33959f88fc	powerpc: Add second POWER8 PVR entry POWER8 comes with two different PVRs. This patch enables the additional PVR in the cputable. The existing entry (PVR=0x4b) is renamed to POWER8E and the new entry (PVR=0x4d) is given POWER8. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-24 14:18:43 +10:00
Oleg Nesterov	6961ed96f1	ptrace/powerpc: revert "hw_breakpoints: Fix racy access to ptrace breakpoints" This reverts commit `07fa7a0a8a` ("hw_breakpoints: Fix racy access to ptrace breakpoints") and removes ptrace_get/put_breakpoints() added by other commits. The patch was fine but we can no longer race with SIGKILL after commit `9899d11f65` ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michael Neuling <mikey@neuling.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-09 10:33:25 -07:00
Linus Torvalds	74b9272bbe	Device tree updates for v3.11 This branch contains the following changes: - Removal of CONFIG_OF_DEVICE, it is always enabled by CONFIG_OF - Remove #ifdef from linux/of_platform.h to increase compiler syntax coverage - Bug fix for address decoding on Bimini and js2x powerpc platforms. - miscellaneous binding changes One note on the above. The binding changes going in from all kinds of different trees has gotten rather out of hand. I picked up some during this cycle, but even going though my tree isn't a great fit. Ian Campbell has prototyped splitting the bindings and .dtb files into a separate repository. The plan is to migrate to using that sometime in the next few kernel releases which should get rid of a lot of the churn on binding docs and .dts files. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJR1fP3AAoJEEFnBt12D9kB3IIP/0Q5ctMespiJ50+ThjGsaR3m sUbQkMK46uL/oupXaJT2ybX2PxLN5LpgvO9rPt77hblOoL0+wZt+j9G0pLy1qZQZ aHprH9SrpGJv6F0SFbHp/+D/m9vESPv+zwYzL9TvrOALvCD7OSZ7tHLaoF7Y1ADM QnZa7pta3Owpu5NsGXaTXLpaZzfXzfWzf4PDzv2FsAIDbtuVJZGJZ7sJVO7Z0r+K KCY85uKJ4VOHY0onBVlM6uoCnopOi2XMMkyxYvR28lL2Kiv2b3np46jG3zX1EZH5 Qxdu85QZn2oio9iaTeYKK8bG9aRIRsXnzCnF2s68n2rQlEtPpWKN9Lj2AS/KJ+Ig obFTOFDHmxt1F4GIA0/HIPkDvRd7GTIwgwYYubEMi44E3Mae0N+xzkIRE41vYP7s 8zaNHbjAjsYjplsvN5gTPxxiU/ta24a5bl7Ont2zmOjAbXCsDajm4NCKZRJ3lb2f FHNsS1zHGmqgJ9zt13GQabo/Tp4t3KwTzBirPQsDokRO4eoL6klcS3GCRv82VWC0 dLnzu92hXcyXgh7mX2sj6sRBSwNygxMn4ZsZJklle38/LynvtrzT72BOZjghS+Vh l553uDInjSJ3IBrXnClPoyObcu50cmsBBgsK39FzU+MF9mcCHmkHQiT52zM6ZW3M wwY1OfcZk3XaT7akcVu6 =CndB -----END PGP SIGNATURE----- Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux Pull device tree updates from Grant Likely: "This branch contains the following changes: - Removal of CONFIG_OF_DEVICE, it is always enabled by CONFIG_OF - Remove #ifdef from linux/of_platform.h to increase compiler syntax coverage - Bug fix for address decoding on Bimini and js2x powerpc platforms. - miscellaneous binding changes One note on the above. The binding changes going in from all kinds of different trees has gotten rather out of hand. I picked up some during this cycle, but even going though my tree isn't a great fit. Ian Campbell has prototyped splitting the bindings and .dtb files into a separate repository. The plan is to migrate to using that sometime in the next few kernel releases which should get rid of a lot of the churn on binding docs and .dts files" * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux: of: Fix address decoding on Bimini and js2x machines of: remove CONFIG_OF_DEVICE usb: chipidea: depend on CONFIG_OF instead of CONFIG_OF_DEVICE of: remove of_platform_driver ibmebus: convert of_platform_driver to platform_driver driver core: move to_platform_driver to platform_device.h mfd: DT bindings for the palmas family MFD ARM: dts: omap3-devkit8000: fix NAND memory binding of/base: fix typos of: remove #ifdef from linux/of_platform.h	2013-07-04 15:51:45 -07:00
Linus Torvalds	65b97fb730	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "This is the powerpc changes for the 3.11 merge window. In addition to the usual bug fixes and small updates, the main highlights are: - Support for transparent huge pages by Aneesh Kumar for 64-bit server processors. This allows the use of 16M pages as transparent huge pages on kernels compiled with a 64K base page size. - Base VFIO support for KVM on power by Alexey Kardashevskiy - Wiring up of our nvram to the pstore infrastructure, including putting compressed oopses in there by Aruna Balakrishnaiah - Move, rework and improve our "EEH" (basically PCI error handling and recovery) infrastructure. It is no longer specific to pseries but is now usable by the new "powernv" platform as well (no hypervisor) by Gavin Shan. - I fixed some bugs in our math-emu instruction decoding and made it usable to emulate some optional FP instructions on processors with hard FP that lack them (such as fsqrt on Freescale embedded processors). - Support for Power8 "Event Based Branch" facility by Michael Ellerman. This facility allows what is basically "userspace interrupts" for performance monitor events. - A bunch of Transactional Memory vs. Signals bug fixes and HW breakpoint/watchpoint fixes by Michael Neuling. And more ... I appologize in advance if I've failed to highlight something that somebody deemed worth it." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits) pstore: Add hsize argument in write_buf call of pstore_ftrace_call powerpc/fsl: add MPIC timer wakeup support powerpc/mpic: create mpic subsystem object powerpc/mpic: add global timer support powerpc/mpic: add irq_set_wake support powerpc/85xx: enable coreint for all the 64bit boards powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig powerpc/mpic: Add get_version API both for internal and external use powerpc: Handle both new style and old style reserve maps powerpc/hw_brk: Fix off by one error when validating DAWR region end powerpc/pseries: Support compression of oops text via pstore powerpc/pseries: Re-organise the oops compression code pstore: Pass header size in the pstore write callback powerpc/powernv: Fix iommu initialization again powerpc/pseries: Inform the hypervisor we are using EBB regs powerpc/perf: Add power8 EBB support powerpc/perf: Core EBB support for 64-bit book3s powerpc/perf: Drop MMCRA from thread_struct powerpc/perf: Don't enable if we have zero events ...	2013-07-04 10:29:23 -07:00
Linus Torvalds	7f0ef0267e	Merge branch 'akpm' (updates from Andrew Morton) Merge first patch-bomb from Andrew Morton: - various misc bits - I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been distracted. There has been quite a bit of activity. - About half the MM queue - Some backlight bits - Various lib/ updates - checkpatch updates - zillions more little rtc patches - ptrace - signals - exec - procfs - rapidio - nbd - aoe - pps - memstick - tools/testing/selftests updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits) tools/testing/selftests: don't assume the x bit is set on scripts selftests: add .gitignore for kcmp selftests: fix clean target in kcmp Makefile selftests: add .gitignore for vm selftests: add hugetlbfstest self-test: fix make clean selftests: exit 1 on failure kernel/resource.c: remove the unneeded assignment in function __find_resource aio: fix wrong comment in aio_complete() drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode drivers/memstick/host/r592.c: convert to module_pci_driver drivers/memstick/host/jmb38x_ms: convert to module_pci_driver pps-gpio: add device-tree binding and support drivers/pps/clients/pps-gpio.c: convert to module_platform_driver drivers/pps/clients/pps-gpio.c: convert to devm_* helpers drivers/parport/share.c: use kzalloc Documentation/accounting/getdelays.c: avoid strncpy in accounting tool aoe: update internal version number to v83 aoe: update copyright date aoe: perform I/O completions in parallel ...	2013-07-03 17:12:13 -07:00
Linus Torvalds	862f001254	PCI changes for the v3.11 merge window: PCI device hotplug - Add pci_alloc_dev() interface (Gu Zheng) - Add pci_bus_get()/put() for reference counting (Jiang Liu) - Fix SR-IOV reference count issues (Jiang Liu) - Remove unused acpi_pci_roots list (Jiang Liu) MSI - Conserve interrupt resources on x86 (Alexander Gordeev) AER - Force fatal severity when component has been reset (Betty Dall) - Reset link below Root Port as well as Downstream Port (Betty Dall) - Fix "Firmware first" flag setting (Bjorn Helgaas) - Don't parse HEST for non-PCIe devices (Bjorn Helgaas) ASPM - Warn when we can't disable ASPM as driver requests (Bjorn Helgaas) Miscellaneous - Add CircuitCo PCI IDs (Darren Hart) - Add AMD CZ SATA and SMBus PCI IDs (Shane Huang) - Work around Ivytown NTB BAR size issue (Jon Mason) - Detect invalid initial BAR values (Kevin Hao) - Add pcibios_release_device() (Sebastian Ott) - Fix powerpc & sparc PCI_UNKNOWN power state usage (Bjorn Helgaas) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJR0dhDAAoJEFmIoMA60/r8AUoP/RrKOXC2GGZCqUIKjUyxaCU+ NXwaNhjfRuge5lZHE8fUAnYVveFTv0iNo8if/md866/pS4il3vxaMWRhZrBddXqe juyxPUGaOb5NmI2C+g5ebQ1xHhnOU6kWrgQ5kQk5GmJdm6BpiWDCFaalyioYj27v FoN/25IG+5EtJjP6kRdQGFZq+RYOqlBfQp4fdFmY5bDsQiCLREH6YWHeNSkH+t1I Eh84WESqGzgaZyCb9QKM2AcU/HMKLux4VXAYp9idVr3tH1j9b/klQI7xNW7sPnkY LIzKTcfF89iXkhxc7zrF0O/n5rC5Cp7LQpiFMV6yCT3w25yWpq9itOwqcZ/nfCv6 fje8P1B2lwGrizkwKKLcosTzWkJewvfLkVye90WS3g0i3zlijF4pfEiw3a2ujA91 MP9/JmX+ZZ5QeGyPuFmYJyMlInH4vtSdegl9jtaeuX4cOnuMP+Ouxnxc+mH2bOfl Z5/K1OSCYLfb27uWM7od2lgb+GFHLMP+RMy073h0ZMpDvM6EnZy5iu1zU9+yJO4S 8/aRhBz4h+YEBinnXOJvHzMfu3wQQ7UvXZqEspgsug2Z5xHvxhMLhrJQgpVUSdsW Nrdm1dNdACV/cvt/lWzUE7SmUaOua/r/cVmViF2ryeRET7t65in+5NHXmXP8ET+r 1WA7pbykfegC9uY84PaK =X3Lo -----END PGP SIGNATURE----- Merge tag 'pci-v3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "PCI device hotplug - Add pci_alloc_dev() interface (Gu Zheng) - Add pci_bus_get()/put() for reference counting (Jiang Liu) - Fix SR-IOV reference count issues (Jiang Liu) - Remove unused acpi_pci_roots list (Jiang Liu) MSI - Conserve interrupt resources on x86 (Alexander Gordeev) AER - Force fatal severity when component has been reset (Betty Dall) - Reset link below Root Port as well as Downstream Port (Betty Dall) - Fix "Firmware first" flag setting (Bjorn Helgaas) - Don't parse HEST for non-PCIe devices (Bjorn Helgaas) ASPM - Warn when we can't disable ASPM as driver requests (Bjorn Helgaas) Miscellaneous - Add CircuitCo PCI IDs (Darren Hart) - Add AMD CZ SATA and SMBus PCI IDs (Shane Huang) - Work around Ivytown NTB BAR size issue (Jon Mason) - Detect invalid initial BAR values (Kevin Hao) - Add pcibios_release_device() (Sebastian Ott) - Fix powerpc & sparc PCI_UNKNOWN power state usage (Bjorn Helgaas)" * tag 'pci-v3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (51 commits) MAINTAINERS: Add ACPI folks for ACPI-related things under drivers/pci PCI: Add CircuitCo vendor ID and subsystem ID PCI: Use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM) PCI: Return early on allocation failures to unindent mainline code PCI: Simplify IOV implementation and fix reference count races PCI: Drop redundant setting of bus->is_added in virtfn_add_bus() unicore32/PCI: Remove redundant call of pci_bus_add_devices() m68k/PCI: Remove redundant call of pci_bus_add_devices() PCI / ACPI / PM: Use correct power state strings in messages PCI: Fix comment typo for pcie_pme_remove() PCI: Rename pci_release_bus_bridge_dev() to pci_release_host_bridge_dev() PCI: Fix refcount issue in pci_create_root_bus() error recovery path ia64/PCI: Clean up pci_scan_root_bus() usage PCI/AER: Reset link for devices below Root Port or Downstream Port ACPI / APEI: Force fatal AER severity when component has been reset PCI/AER: Remove "extern" from function declarations PCI/AER: Move AER severity defines to aer.h PCI/AER: Set dev->__aer_firmware_first only for matching devices PCI/AER: Factor out HEST device type matching PCI/AER: Don't parse HEST table for non-PCIe devices ...	2013-07-03 16:31:35 -07:00
Zhang Yanfei	427fcd23b5	powerpc: Remove savemaxmem parameter setup saved_max_pfn is used to know the amount of memory that the previous kernel used. And for powerpc, we set saved_max_pfn by passing the kernel commandline parameter "savemaxmem=". The only user of saved_max_pfn in powerpc is read_oldmem interface. Since we have removed read_oldmem, we don't need this parameter anymore. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave@sr71.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:08:03 -07:00
Jiang Liu	dbe67df4ba	mm: enhance free_reserved_area() to support poisoning memory with zero Address more review comments from last round of code review. 1) Enhance free_reserved_area() to support poisoning freed memory with pattern '0'. This could be used to get rid of poison_init_mem() on ARM64. 2) A previous patch has disabled memory poison for initmem on s390 by mistake, so restore to the original behavior. 3) Remove redundant PAGE_ALIGN() when calling free_reserved_area(). Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: <sworddragon2@aol.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:32 -07:00
Jiang Liu	11199692d8	mm: change signature of free_reserved_area() to fix building warnings Change signature of free_reserved_area() according to Russell King's suggestion to fix following build warnings: arch/arm/mm/init.c: In function 'mem_init': arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default] free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL); ^ In file included from include/linux/mman.h:4:0, from arch/arm/mm/init.c:15: include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void ' extern unsigned long free_reserved_area(unsigned long start, unsigned long end, mm/page_alloc.c: In function 'free_reserved_area': >> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default] In file included from arch/mips/include/asm/page.h:49:0, from include/linux/mmzone.h:20, from include/linux/gfp.h:4, from include/linux/mm.h:8, from mm/page_alloc.c:18: arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void ' but argument is of type 'long unsigned int' mm/page_alloc.c: In function 'free_area_init_nodes': mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds] Also address some minor code review comments. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: <sworddragon2@aol.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:32 -07:00
Linus Torvalds	790eac5640	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second set of VFS changes from Al Viro: "Assorted f_pos race fixes, making do_splice_direct() safe to call with i_mutex on parent, O_TMPFILE support, Jeff's locks.c series, ->d_hash/->d_compare calling conventions changes from Linus, misc stuff all over the place." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) Document ->tmpfile() ext4: ->tmpfile() support vfs: export lseek_execute() to modules lseek_execute() doesn't need an inode passed to it block_dev: switch to fixed_size_llseek() cpqphp_sysfs: switch to fixed_size_llseek() tile-srom: switch to fixed_size_llseek() proc_powerpc: switch to fixed_size_llseek() ubi/cdev: switch to fixed_size_llseek() pci/proc: switch to fixed_size_llseek() isapnp: switch to fixed_size_llseek() lpfc: switch to fixed_size_llseek() locks: give the blocked_hash its own spinlock locks: add a new "lm_owner_key" lock operation locks: turn the blocked_list into a hashtable locks: convert fl_link to a hlist_node locks: avoid taking global lock if possible when waking up blocked waiters locks: protect most of the file_lock handling with i_lock locks: encapsulate the fl_link list handling locks: make "added" in __posix_lock_file a bool ...	2013-07-03 09:10:19 -07:00
Benjamin Herrenschmidt	c039e3a8dd	powerpc: Handle both new style and old style reserve maps When Jeremy introduced the new device-tree based reserve map, he made the code in early_reserve_mem_dt() bail out if it found one, thus not reserving the initrd nor processing the old style map. I hit problems with variants of kexec that didn't put the initrd in the new style map either. While these could/will be fixed, I believe we should be safe here and rather reserve more than not enough. We could have a firmware passing stuff via the new style map, and in the middle, a kexec that knows nothing about it and adding other things to the old style map. I don't see a big issue with processing both and reserving everything that needs to be. memblock_reserve() supports overlaps fine these days. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-02 08:20:49 +10:00
Michael Neuling	e2a800beac	powerpc/hw_brk: Fix off by one error when validating DAWR region end The Data Address Watchpoint Register (DAWR) on POWER8 can take a 512 byte range but this range must not cross a 512 byte boundary. Unfortunately we were off by one when calculating the end of the region, hence we were not allowing some breakpoint regions which were actually valid. This fixes this error. Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # 3.9+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-02 08:20:49 +10:00
Benjamin Herrenschmidt	24a72acac1	Linux 3.10 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ= =Ob6Z -----END PGP SIGNATURE----- Merge tag 'v3.10' into next Merge 3.10 in order to get some of the last minute powerpc changes, resolve conflicts and add additional fixes on top of them.	2013-07-01 17:57:25 +10:00
Michael Ellerman	330a1eb777	powerpc/perf: Core EBB support for 64-bit book3s Add support for EBB (Event Based Branches) on 64-bit book3s. See the included documentation for more details. EBBs are a feature which allows the hardware to branch directly to a specified user space address when a PMU event overflows. This can be used by programs for self-monitoring with no kernel involvement in the inner loop. Most of the logic is in the generic book3s code, primarily to avoid a proliferation of PMU callbacks. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:50:10 +10:00
Michael Ellerman	2ac138ca21	powerpc/perf: Drop MMCRA from thread_struct In commit `59affcd` "Context switch more PMU related SPRs" I added more PMU SPRs to thread_struct, later modified in commit b11ae95. To add insult to injury it turns out we don't need to switch MMCRA as it's only user readable, and the value is recomputed by the PMU code. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:50:07 +10:00
Michael Ellerman	b14b6260ef	powerpc: Wire up the HV facility unavailable exception Similar to the facility unavailble exception, except the facilities are controlled by HFSCR. Adapt the facility_unavailable_exception() so it can be called for either the regular or Hypervisor facility unavailable exceptions. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:47 +10:00
Michael Ellerman	021424a1fc	powerpc: Rename and flesh out the facility unavailable exception handler The exception at 0xf60 is not the TM (Transactional Memory) unavailable exception, it is the "Facility Unavailable Exception", rename it as such. Flesh out the handler to acknowledge the fact that it can be called for many reasons, one of which is TM being unavailable. Use STD_EXCEPTION_COMMON() for the exception body, for some reason we had it open-coded, I've checked the generated code is identical. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.10] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:44 +10:00
Michael Ellerman	c9f69518e5	powerpc: Remove KVMTEST from RELON exception handlers KVMTEST is a macro which checks whether we are taking an exception from guest context, if so we branch out of line and eventually call into the KVM code to handle the switch. When running real guests on bare metal (HV KVM) the hardware ensures that we never take a relocation on exception when transitioning from guest to host. For PR KVM we disable relocation on exceptions ourself in kvmppc_core_init_vm(), as of commit `a413f47` "Disable relocation on exceptions whenever PR KVM is active". So convert all the RELON macros to use NOTEST, and drop the remaining KVM_HANDLER() definitions we have for 0xe40 and 0xe80. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.9+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:40 +10:00
Michael Ellerman	1d567cb4bd	powerpc: Remove unreachable relocation on exception handlers We have relocation on exception handlers defined for h_data_storage and h_instr_storage. However we will never take relocation on exceptions for these because they can only come from a guest, and we never take relocation on exceptions when we transition from guest to host. We also have a handler for hmi_exception (Hypervisor Maintenance) which is defined in the architecture to never be delivered with relocation on, see see v2.07 Book III-S section 6.5. So remove the handlers, leaving a branch to self just to be double extra paranoid. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> CC: <stable@vger.kernel.org> [v3.9+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:37 +10:00
Chen Gang	8246aca705	powerpc/smp: Section mismatch from smp_release_cpus to __initdata spinning_secondaries the smp_release_cpus is a normal funciton and called in normal environments, but it calls the __initdata spinning_secondaries. need modify spinning_secondaries to match smp_release_cpus. the related warning: (the linker report boot_paca.33377, but it should be spinning_secondaries) ----------------------------------------------------------------------------- WARNING: arch/powerpc/kernel/built-in.o(.text+0x23176): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377 The function .smp_release_cpus() references the variable __initdata boot_paca.33377. This is often because .smp_release_cpus lacks a __initdata annotation or the annotation of boot_paca.33377 is wrong. WARNING: arch/powerpc/kernel/built-in.o(.text+0x231fe): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377 The function .smp_release_cpus() references the variable __initdata boot_paca.33377. This is often because .smp_release_cpus lacks a __initdata annotation or the annotation of boot_paca.33377 is wrong. ----------------------------------------------------------------------------- Signed-off-by: Chen Gang <gang.chen@asianux.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:49:27 +10:00
Chen Gang	7029705a9d	powerpc/nvram64: Need return the related error code on failure occurs When error occurs, need return the related error code to let upper caller know about it. ppc_md.nvram_size() can return the error code (e.g. core99_nvram_size() in 'arch/powerpc/platforms/powermac/nvram.c'). Also set ret value when only need it, so can save structions for normal cases. Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:46:56 +10:00
Li Zhong	cce606feb4	powerpc: Set cpu sibling mask before online cpu It seems following race is possible: cpu0 cpux smp_init->cpu_up->_cpu_up __cpu_up kick_cpu(1) ------------------------------------------------------------------------- waiting online ... ... notify CPU_STARTING set cpux active set cpux online ------------------------------------------------------------------------- finish waiting online ... sched_init_smp init_sched_domains(cpu_active_mask) build_sched_domains set cpux sibling info ------------------------------------------------------------------------- Execution of cpu0 and cpux could be concurrent between two separator lines. So if the cpux sibling information was set too late (normally impossible, but could be triggered by adding some delay in start_secondary, after setting cpu online), build_sched_domains() running on cpu0 might see cpux active, with an empty sibling mask, then cause some bad address accessing like following: [ 0.099855] Unable to handle kernel paging request for data at address 0xc00000038518078f [ 0.099868] Faulting instruction address: 0xc0000000000b7a64 [ 0.099883] Oops: Kernel access of bad area, sig: 11 [#1] [ 0.099895] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries [ 0.099922] Modules linked in: [ 0.099940] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1-00120-gb973425-dirty #16 [ 0.099956] task: c0000001fed80000 ti: c0000001fed7c000 task.ti: c0000001fed7c000 [ 0.099971] NIP: c0000000000b7a64 LR: c0000000000b7a40 CTR: c0000000000b4934 [ 0.099985] REGS: c0000001fed7f760 TRAP: 0300 Not tainted (3.10.0-rc1-00120-gb973425-dirty) [ 0.099997] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 24272828 XER: 20000003 [ 0.100045] SOFTE: 1 [ 0.100053] CFAR: c000000000445ee8 [ 0.100064] DAR: c00000038518078f, DSISR: 40000000 [ 0.100073] GPR00: 0000000000000080 c0000001fed7f9e0 c000000000c84d48 0000000000000010 GPR04: 0000000000000010 0000000000000000 c0000001fc55e090 0000000000000000 GPR08: ffffffffffffffff c000000000b80b30 c000000000c962d8 00000003845ffc5f GPR12: 0000000000000000 c00000000f33d000 c00000000000b9e4 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000 GPR20: c000000000ccf750 0000000000000000 c000000000c94d48 c0000001fc504000 GPR24: c0000001fc504000 c0000001fecef848 c000000000c94d48 c000000000ccf000 GPR28: c0000001fc522090 0000000000000010 c0000001fecef848 c0000001fed7fae0 [ 0.100293] NIP [c0000000000b7a64] .get_group+0x84/0xc4 [ 0.100307] LR [c0000000000b7a40] .get_group+0x60/0xc4 [ 0.100318] Call Trace: [ 0.100332] [c0000001fed7f9e0] [c0000000000dbce4] .lock_is_held+0xa8/0xd0 (unreliable) [ 0.100354] [c0000001fed7fa70] [c0000000000bf62c] .build_sched_domains+0x728/0xd14 [ 0.100375] [c0000001fed7fbe0] [c000000000af67bc] .sched_init_smp+0x4fc/0x654 [ 0.100394] [c0000001fed7fce0] [c000000000adce24] .kernel_init_freeable+0x17c/0x30c [ 0.100413] [c0000001fed7fdb0] [c00000000000ba08] .kernel_init+0x24/0x12c [ 0.100431] [c0000001fed7fe30] [c000000000009f74] .ret_from_kernel_thread+0x5c/0x68 [ 0.100445] Instruction dump: [ 0.100456] 38800010 38a00000 4838e3f5 60000000 7c6307b4 2fbf0000 419e0040 3d220001 [ 0.100496] 78601f24 39491590 e93e0008 7d6a002a <7d69582a> f97f0000 7d4a002a e93e0010 [ 0.100559] ---[ end trace 31fd0ba7d8756001 ]--- This patch tries to move the sibling maps updating before notify_cpu_starting() and cpu online, and a write barrier there to make sure sibling maps are updated before active and online mask. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:46:55 +10:00
Paul Gortmaker	061d19f279	powerpc: Delete __cpuinit usage from all users The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit `5e427ec2d0` ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the powerpc uses of the __cpuinit macros. There are no __CPUINIT users in assembly files in powerpc. [1] https://lkml.org/lkml/2013/5/20/589 Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:36 +10:00
Joe Perches	cc293bf7a9	powerpc/idle: Convert use of typedef ctl_table to struct ctl_table This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:35 +10:00
Kevin Hao	348c2298a6	powerpc: Don't flush/invalidate the d/icache for an unknown relocation type For an unknown relocation type since the value of r4 is just the 8bit relocation type, the sum of r4 and r7 may yield an invalid memory address. For example: In normal case: r4 = c00xxxxx r7 = 40000000 r4 + r7 = 000xxxxx For an unknown relocation type: r4 = 000000xx r7 = 40000000 r4 + r7 = 400000xx 400000xx is an invalid memory address for a board which has just 512M memory. And for operations such as dcbst or icbi may cause bus error for an invalid memory address on some platforms and then cause the board reset. So we should skip the flush/invalidate the d/icache for an unknown relocation type. Signed-off-by: Kevin Hao <haokexin@gmail.com> Acked-by: Suzuki K. Poulose <suzuki@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:34 +10:00
Gavin Shan	eeb6361fdd	powerpc/eeh: Avoid build warnings The patch is for avoiding following build warnings: The function .pnv_pci_ioda_fixup() references the function __init .eeh_init(). This is often because .pnv_pci_ioda_fixup lacks a __init The function .pnv_pci_ioda_fixup() references the function __init .eeh_addr_cache_build(). This is often because .pnv_pci_ioda_fixup lacks a __init Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:33 +10:00
Gavin Shan	56ca4fde90	powerpc/eeh: Refactor the output message We needn't the the whole backtrace other than one-line message in the error reporting interrupt handler. For errors triggered by access PCI config space or MMIO, we replace "WARN(1, ...)" with pr_err() and dump_stack(). The patch also adds more output messages to indicate what EEH core is doing. Besides, some printk() are replaced with pr_warning(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:33 +10:00
Gavin Shan	88b6d14b2b	powerpc/eeh: Fix address catch for PowerNV On the PowerNV platform, the EEH address cache isn't built correctly because we skipped the EEH devices without binding PE. The patch fixes that. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:32 +10:00
Gavin Shan	652defed48	powerpc/eeh: Check PCIe link after reset After reset (e.g. complete reset) in order to bring the fenced PHB back, the PCIe link might not be ready yet. The patch intends to make sure the PCIe link is ready before accessing its subordinate PCI devices. The patch also fixes that wrong values restored to PCI_COMMAND register for PCI bridges. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:31 +10:00
Gavin Shan	c35ae1796b	powerpc/eeh: Don't collect PCI-CFG data on PHB When the PHB is fenced or dead, it's pointless to collect the data from PCI config space of subordinate PCI devices since it should return 0xFF's. The patch also fixes overwritten buffer while getting PCI config data. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-01 11:10:31 +10:00
Michael Neuling	090b9284d7	powerpc/tm: Clear MSR RI in non-recoverable TM code When we treclaim and trecheckpoint there's an unavoidable period when r1 will not be a valid kernel stack pointer. This patch clears the MSR recoverable interrupt (RI) bit over these regions to indicate we have an invalid kernel stack pointer. For treclaim, the region over which we clear MSR RI is larger than required to avoid the need for an extra costly mtmsrd. Thanks to Paulus for suggesting this change. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-30 15:49:43 +10:00
James Yang	80aa0fb494	powerpc: Fix string instr. emulation for 32-bit processes on ppc64 String instruction emulation would erroneously result in a segfault if the upper bits of the EA are set and is so high that it fails access check. Truncate the EA to 32 bits if the process is 32-bit. Signed-off-by: James Yang <James.Yang@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-30 15:49:40 +10:00
Guenter Roeck	7846de406f	powerpc/pci: Improve device hotplug initialization Commit `37f02195b` (powerpc/pci: fix PCI-e devices rescan issue on powerpc platform) fixes a problem with interrupt and DMA initialization on hot plugged devices. With this commit, interrupt and DMA initialization for hot plugged devices is handled in the pci device enable function. This approach has a couple of drawbacks. First, it creates two code paths for device initialization, one for hot plugged devices and another for devices known during the initial PCI scan. Second, the initialization code for hot plugged devices is only called when the device is enabled, ie typically in the probe function. Also, the platform specific setup code is called each time pci_enable_device() is called, not only once during device discovery, meaning it is actually called multiple times, once for devices discovered during the initial scan and again each time a driver is re-loaded. The visible result is that interrupt pins are only assigned to hot plugged devices when the device driver is loaded. Effectively this changes the PCI probe API, since pci_dev->irq and the device's dma configuration will now only be valid after pci_enable() was called at least once. A more subtle change is that platform specific PCI device setup is moved from device discovery into the driver's probe function, more specifically into the pci_enable_device() call. To fix the inconsistencies, add new function pcibios_add_device. Call pcibios_setup_device from pcibios_setup_bus_devices if device setup is not complete, and from pcibios_add_device if bus setup is complete. With this change, device setup code is moved back into device initialization, and called exactly once for both static and hot plugged devices. [ This also fixes a regression introduced by the above patch which causes dev->irq to be overwritten under some cirumstances after MSIs have been enabled for the device which leads to crashes due to the MSI core "hijacking" dev->irq to store the base MSI number and not the LSI. --BenH ] Cc: Yuanquan Chen <Yuanquan.Chen@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-30 08:46:46 +10:00
Al Viro	b33159b7d2	proc_powerpc: switch to fixed_size_llseek() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:50 +04:00
Gavin Shan	5459ae1431	powerpc/eeh: Use interruptible sleep in keehd To replace down() with down_interrutible() to avoid following warning: [c00000007ba7b710] [c000000000014410] .__switch_to+0x1b0/0x380 [c00000007ba7b7c0] [c0000000007b408c] .__schedule+0x3ec/0x970 [c00000007ba7ba50] [c0000000007b1f24] .schedule_timeout+0x1a4/0x2b0 [c00000007ba7bb30] [c0000000007b34a4] .__down+0xa4/0x104 [c00000007ba7bbf0] [c0000000000b9230] .down+0x60/0x70 [c00000007ba7bc80] [c0000000000336d0] .eeh_event_handler+0x70/0x190 [c00000007ba7bd30] [c0000000000b1a58] .kthread+0xe8/0xf0 [c00000007ba7be30] [c00000000000a05c] .ret_from_kernel_thread+0x5c/0x8 This also avoids keeping the load average up while doing nothing. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-25 17:24:41 +10:00
Gavin Shan	ef6a285773	powerpc/eeh: Remove eeh_mutex Originally, eeh_mutex was introduced to protect the PE hierarchy tree and the attached EEH devices because EEH core was possiblly running with multiple threads to access the PE hierarchy tree. However, we now have only one kthread in EEH core. So we needn't the eeh_mutex and just remove it. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-25 17:24:41 +10:00
Michael Neuling	540e07c67e	powerpc/hw_brk: Fix clearing of extraneous IRQ In `9422de3` "powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint registers" we changed the way we mark extraneous irqs with this: - info->extraneous_interrupt = !((bp->attr.bp_addr <= dar) && - (dar - bp->attr.bp_addr < bp->attr.bp_len)); + if (!((bp->attr.bp_addr <= dar) && + (dar - bp->attr.bp_addr < bp->attr.bp_len))) + info->type \|= HW_BRK_TYPE_EXTRANEOUS_IRQ; Unfortunately this is bogus as it never clears extraneous IRQ if it's already set. This correctly clears extraneous IRQ before possibly setting it. Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-25 17:24:40 +10:00
Michael Neuling	b0b0aa9c7f	powerpc/hw_brk: Fix setting of length for exact mode breakpoints The smallest match region for both the DABR and DAWR is 8 bytes, so the kernel needs to filter matches when users want to look at regions smaller than this. Currently we set the length of PPC_BREAKPOINT_MODE_EXACT breakpoints to 8. This is wrong as in exact mode we should only match on 1 address, hence the length should be 1. This ensures that the kernel will filter out any exact mode hardware breakpoint matches on any addresses other than the requested one. Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-25 17:24:39 +10:00
Aneesh Kumar K.V	12bc9f6fc1	powerpc: Replace find_linux_pte with find_linux_pte_or_hugepte Replace find_linux_pte with find_linux_pte_or_hugepte and explicitly document why we don't need to handle transparent hugepages at callsites. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-21 16:01:54 +10:00
Gavin Shan	b95cd2cd44	powerpc/eeh: Allow to check fenced PHB proactively It's meaningless to handle frozen PE if we already had fenced PHB. The patch intends to check the PHB state before checking PE. If the PHB has been put into fenced state, we need take care of that firstly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:06:53 +10:00
Gavin Shan	8a6b1bc70d	powerpc/eeh: EEH core to handle special event On PowerNV platform, the EEH event caused by interrupt won't have binding PE. The patch enables EEH core to handle the special event. To avoid the current logic we have, The eeh_handle_event() is renamed to eeh_handle_normal_event(), and the eeh_handle_special_event() is introduced. The function eeh_handle_event() dispatches to above two functions according to the input parameter. Besides, new backend "next_error" added to eeh_ops and it's expected to have following return values: 4 - Dead IOC 3 - Dead PHB 2 - Fenced PHB 1 - Frozen PE 0 - No error found Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:06:14 +10:00
Gavin Shan	4907581dc2	powerpc/eeh: Export confirm_error_lock An EEH event is created and queued to the event queue for each ingress EEH error. When there're mutiple EEH errors, we need serialize the process to keep consistent PE state (flags). The spinlock "confirm_error_lock" was introduced for the purpose. We'll inject EEH event upon error reporting interrupts on PowerNV platform. So we export the spinlock for that to use for consistent PE state. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:06:11 +10:00
Gavin Shan	9986659534	powerpc/eeh: Allow to purge EEH events On PowerNV platform, we might run into the situation where subsequent events are duplicated events of former one, which is being processed. For the case, we need the function implemented by the patch to purge EEH events accordingly. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:06:07 +10:00
Gavin Shan	5a71978e4b	powerpc/eeh: Trace time on first error for PE We're not expecting that one specific PE got frozen for over 5 times in last hour. Otherwise, the PE will be removed from the system upon newly coming EEH errors. The patch introduces time stamp to trace the first error on specific PE in last hour and function to update that accordingly. Besides, the time stamp is recovered during PE hotplug path as we did for frozen count. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:06:04 +10:00
Gavin Shan	c86085580d	powerpc/eeh: Single kthread to handle events We possiblly have multiple kthreads running for multiple EEH errors (events) and use one spinlock to make the process of handling those EEH events serialized. That's unnecessary and the patch creates only one kthread, which is started during EEH core initialization time in eeh_init(). A new semaphore introduced to count the number of existing EEH events in the queue and the kthread waiting on the semaphore. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:06:01 +10:00
Gavin Shan	26a74850b3	powerpc/eeh: Delay EEH probe during hotplug While doing EEH recovery, the PCI devices of the problematic PE should be removed and then added to the system again. During the so-called hotplug event, the PCI devices of the problematic PE will be probed through early/late phase. We would delay EEH probe on late point for PowerNV platform since the PCI device isn't available in early phase. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:58 +10:00
Gavin Shan	326a98ea93	powerpc/eeh: Refactor eeh_reset_pe_once() We shouldn't check that the returned PE status is exactly equal to (EEH_STATE_MMIO_ACTIVE \| EEH_STATE_DMA_ACTIVE) but instead only check that they are both set. [benh: changelog] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:54 +10:00
Gavin Shan	21fd21f590	powerpc/eeh: EEH post initialization operation The patch adds new EEH operation post_init. It's used to notify the platform that EEH core has completed the EEH probe. By that, PowerNV platform starts to use the services supplied by EEH functionality. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:51 +10:00
Gavin Shan	51fb5f5632	powerpc/eeh: Make eeh_init() public For EEH on PowerNV platform, we will do EEH probe based on the real PCI devices. The PCI devices are available after PCI probe. So we have to call eeh_init() explicitly on PowerNV platform after PCI probe. The patch also does EEH probe for PowerNV platform in eeh_init(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:48 +10:00
Gavin Shan	8cdb283371	powerpc/eeh: Trace PCI bus from PE There're several types of PEs can be supported for now: PHB, Bus and Device dependent PE. For PCI bus dependent PE, tracing the corresponding PCI bus from PE (struct eeh_pe) would make the code more efficient. The patch also enables the retrieval of PCI bus based on the PCI bus dependent PE. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:45 +10:00
Gavin Shan	0156680854	powerpc/eeh: Make eeh_pe_get() public While processing EEH event interrupt from P7IOC, we need function to retrieve the PE according to the indicated EEH device. The patch makes function eeh_pe_get() public so that other source files can call it for that purpose. Also, the patch fixes referring to wrong BDF (Bus/Device/Function) address while searching PE in function __eeh_pe_get(). Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:41 +10:00
Gavin Shan	9ff67433ce	powerpc/eeh: Make eeh_phb_pe_get() public One of the possible cases indicated by P7IOC interrupt is fenced PHB. For that case, we need fetch the PE corresponding to the PHB and disable the PHB and all subordinate PCI buses/devices, recover from the fenced state and eventually enable the whole PHB. We need one function to fetch the PHB PE outside eeh_pe.c and the patch is going to make eeh_phb_pe_get() public for that purpose. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:38 +10:00
Gavin Shan	317f06de78	powerpc/eeh: Move common part to kernel directory The patch moves the common part of EEH core into arch/powerpc/kernel directory so that we needn't PPC_PSERIES while compiling POWERNV platform: * Move the EEH common part into arch/powerpc/kernel * Move the functions for PCI hotplug from pSeries platform to arch/powerpc/kernel/pci-hotplug.c * Move CONFIG_EEH from arch/powerpc/platforms/pseries/Kconfig to arch/powerpc/platforms/Kconfig * Adjust makefile accordingly Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:35 +10:00
Michael Neuling	87b4e5393a	powerpc/tm: Fix return of active 64bit signals Currently we only restore signals which are transactionally suspended but it's possible that the transaction can be restored even when it's active. Most likely this will result in a transactional rollback by the hardware as the transaction will have been doomed by an earlier treclaim. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes assumptions based on having software rollback. This changes the signal return code to always restore both contexts on 64 bit signal return. It also ensures that the MSR TM bits are properly restored from the signal context which they are not currently. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:28 +10:00
Michael Neuling	55e4341850	powerpc/tm: Fix return of 32bit rt signals to active transactions Currently we only restore signals which are transactionally suspended but it's possible that the transaction can be restored even when it's active. Most likely this will result in a transactional rollback by the hardware as the transaction will have been doomed by an earlier treclaim. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes assumptions based on having software rollback. This changes the signal return code to always restore both contexts on 32 bit rt signal return. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:25 +10:00
Michael Neuling	2c27a18f87	powerpc/tm: Fix restoration of MSR on 32bit signal return Currently we clear out the MSR TM bits on signal return assuming that the signal should never return to an active transaction. This is bogus as the user may do this. It's most likely the transaction will be doomed due to a treclaim but that's a problem for the HW not the kernel. The current code is a legacy of earlier kernel implementations which did software rollback of active transactions in the kernel. That code has now gone but we didn't correctly fix up this part of the signals code which still makes the assumption that it must be returning to a suspended transaction. This pulls out both MSR TM bits from the user supplied context rather than just setting TM suspend. We pull out only the bits needed to ensure the user can't do anything dangerous to the MSR. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:22 +10:00
Michael Neuling	fee5545071	powerpc/tm: Fix 32 bit non-rt signals Currently sys_sigreturn() is TM unaware. Therefore, if we take a 32 bit signal without SIGINFO (non RT) inside a transaction, on signal return we don't restore the signal frame correctly. This checks if the signal frame being restoring is an active transaction, and if so, it copies the additional state to ptregs so it can be restored. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:18 +10:00
Michael Neuling	1d25f11fdb	powerpc/tm: Fix writing top half of MSR on 32 bit signals The MSR TM controls are in the top 32 bits of the MSR hence on 32 bit signals, we stick the top half of the MSR in the checkpointed signal context so that the user can access it. Unfortunately, we don't currently write anything to the checkpointed signal context when coming in a from a non transactional process and hence the top MSR bits can contain junk. This updates the 32 bit signal handling code to always write something to the top MSR bits so that users know if the process is transactional or not and the kernel can use it on signal return. Signed-off-by: Michael Neuling <mikey@neuling.org> cc: stable@vger.kernel.org (v3.9+) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:15 +10:00
Benjamin Herrenschmidt	968219fa33	powerpc/8xx: Remove 8xx specific "minimal FPU emulation" This is duplicated code from math-emu and implements such a small subset of the FPU (load/stores/fmr) that it's essentially pointless nowdays. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:12 +10:00
Benjamin Herrenschmidt	4e63f8edfe	powerpc/math-emu: Allow math-emu to be used for HW FPU (Including 64-bit ones) This allow SW emulation by the kernel of optional instructions such as fsqrt which aren't implemented on some processors, and thus fixes some Fedora 19 issues such as Anaconda since the compiler is set to generate those by default on 64-bit. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:05:09 +10:00
Bharat Bhushan	13d543cd79	powerpc: Restore dbcr0 on user space exit On BookE (Branch taken + Single Step) is as same as Branch Taken on BookS and in Linux we simulate BookS behavior for BookE as well. When doing so, in Branch taken handling we want to set DBCR0_IC but we update the current->thread->dbcr0 and not DBCR0. Now on 64bit the current->thread.dbcr0 (and other debug registers) is synchronized ONLY on context switch flow. But after handling Branch taken in debug exception if we return back to user space without context switch then single stepping change (DBCR0_ICMP) does not get written in h/w DBCR0 and Instruction Complete exception does not happen. This fixes using ptrace reliably on BookE-PowerPC lmbench latency test (lat_syscall) Results are (they varies a little on each run) 1) ./lat_syscall <action> /dev/shm/uImage action: Open read write stat fstat null Before: 3.8618 0.2017 0.2851 1.6789 0.2256 0.0856 After: 3.8580 0.2017 0.2851 1.6955 0.2255 0.0856 1) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage action: Open read write stat fstat null Before: 4.1388 0.2238 0.3066 1.7106 0.2256 0.0856 After: 4.1413 0.2236 0.3062 1.7107 0.2256 0.0856 [ Slightly modified to avoid extra branch in the fast path on Book3S and fix build on all non-BookE 64-bit -- BenH ] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 17:04:19 +10:00
Alexey Kardashevskiy	4e13c1ac6b	powerpc/vfio: Enable on PowerNV platform This initializes IOMMU groups based on the IOMMU configuration discovered during the PCI scan on POWERNV (POWER non virtualized) platform. The IOMMU groups are to be used later by the VFIO driver, which is used for PCI pass through. It also implements an API for mapping/unmapping pages for guest PCI drivers and providing DMA window properties. This API is going to be used later by QEMU-VFIO to handle h_put_tce hypercalls from the KVM guest. The iommu_put_tce_user_mode() does only a single page mapping as an API for adding many mappings at once is going to be added later. Although this driver has been tested only on the POWERNV platform, it should work on any platform which supports TCE tables. As h_put_tce hypercall is received by the host kernel and processed by the QEMU (what involves calling the host kernel again), performance is not the best - circa 220MB/s on 10Gb ethernet network. To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config option and configure VFIO as required. Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 16:55:14 +10:00
Alistair Popple	071df9422a	powerpc: Add a configuration option for early BootX/OpenFirmware debug Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 16:55:12 +10:00
Jeremy Kerr	0962e8004e	powerpc/prom: Scan reserved-ranges node for memory reservations Based on benh's proposal at https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-September/101237.html, this change provides support for reserving memory from the reserved-ranges node at the root of the device tree. We just call memblock_reserve on these ranges for now. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 16:55:11 +10:00
Kevin Hao	3139b0a797	powerpc: Remove the unneeded trigger of decrementer interrupt in decrementer_check_overflow Previously in order to handle the edge sensitive decrementers, we choose to set the decrementer to 1 to trigger a decrementer interrupt when re-enabling interrupts. But with the rework of the lazy EE, we would replay the decrementer interrupt when re-enabling interrupts if a decrementer interrupt occurs with irq soft-disabled. So there is no need to trigger a decrementer interrupt in this case any more. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 16:55:10 +10:00
Suzuki K. Poulose	35fd219a26	powerpc: Move the single step enable code to a generic path This patch moves the single step enable code used by kprobe to a generic routine header so that, it can be re-used by other code, in this case, uprobes. No functional changes. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Cc: Ananth N Mavinakaynahalli <ananth@in.ibm.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@ozlabs.org Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 16:55:09 +10:00
Suzuki K. Poulose	85f395c5b0	powerpc/kprobes: Do not disable External interrupts during single step External/Decrement exceptions have lower priority than the Debug Exception. So, we don't have to disable the External interrupts before a single step. However, on BookE, Critical Input Exception(CE) has higher priority than a Debug Exception. Hence we mask them. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Ananth N Mavinakaynahalli <ananth@in.ibm.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-20 16:55:09 +10:00
Benjamin Herrenschmidt	230b303479	powerpc: Fix missing/delayed calls to irq_work When replaying interrupts (as a result of the interrupt occurring while soft-disabled), in the case of the decrementer, we are exclusively testing for a pending timer target. However we also use decrementer interrupts to trigger the new "irq_work", which in this case would be missed. This change the logic to force a replay in both cases of a timer boundary reached and a decrementer interrupt having actually occurred while disabled. The former test is still useful to catch cases where a CPU having been hard-disabled for a long time completely misses the interrupt due to a decrementer rollover. CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Steven Rostedt <rostedt@goodmis.org>	2013-06-15 12:33:30 +10:00
Paul Mackerras	bf593907f7	powerpc: Fix emulation of illegal instructions on PowerNV platform Normally, the kernel emulates a few instructions that are unimplemented on some processors (e.g. the old dcba instruction), or privileged (e.g. mfpvr). The emulation of unimplemented instructions is currently not working on the PowerNV platform. The reason is that on these machines, unimplemented and illegal instructions cause a hypervisor emulation assist interrupt, rather than a program interrupt as on older CPUs. Our vector for the emulation assist interrupt just calls program_check_exception() directly, without setting the bit in SRR1 that indicates an illegal instruction interrupt. This fixes it by making the emulation assist interrupt set that bit before calling program_check_interrupt(). With this, old programs that use no-longer implemented instructions such as dcba now work again. CC: <stable@vger.kernel.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-15 12:24:11 +10:00
Michael Ellerman	0e37739b1c	powerpc: Fix stack overflow crash in resume_kernel when ftracing It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by `fe1952fc0a` "powerpc: Rework runlatch code" by myself --BenH ] CC: <stable@vger.kernel.org> [v3.4+] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-15 12:21:57 +10:00
Bjorn Helgaas	df58f46c0f	Merge branch 'pci/jiang-bus-lock-v3' into next * pci/jiang-bus-lock-v3: PCI: Return early on allocation failures to unindent mainline code PCI: Simplify IOV implementation and fix reference count races PCI: Drop redundant setting of bus->is_added in virtfn_add_bus() unicore32/PCI: Remove redundant call of pci_bus_add_devices() m68k/PCI: Remove redundant call of pci_bus_add_devices() PCI: Rename pci_release_bus_bridge_dev() to pci_release_host_bridge_dev() PCI: Fix refcount issue in pci_create_root_bus() error recovery path ia64/PCI: Clean up pci_scan_root_bus() usage PCI: Convert alloc_pci_dev(void) to pci_alloc_dev(bus) PCI: Introduce pci_alloc_dev(struct pci_bus*) to replace alloc_pci_dev() PCI: Introduce pci_bus_{get\|put}() to manage PCI bus reference count Conflicts: drivers/pci/probe.c	2013-06-14 17:47:46 -06:00
Rob Herring	c45640e4a9	ibmebus: convert of_platform_driver to platform_driver ibmebus is the last remaining user of of_platform_driver and the conversion to a regular platform driver is trivial. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Grant Likely <grant.likely@linaro.org>	2013-06-12 12:37:26 +01:00
Michael Ellerman	b11ae95100	powerpc: Partial revert of "Context switch more PMU related SPRs" In commit `59affcd` I added context switching of more PMU SPRs, because they are potentially exposed to userspace on Power8. However despite me being a smart arse in the commit message it's actually not correct. In particular it interacts badly with a global perf record. We will have to do something more complicated, but that will have to wait for 3.11. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:35 +10:00
Michael Neuling	82a9f16adc	powerpc/hw_breakpoints: Add DABRX cpu feature to fix 32-bit regression When introducing support for DABRX in `4474ef0`, we broke older 32-bit CPUs that don't have that register. Some CPUs have a DABR but not DABRX. Configuration are: - No 32bit CPUs have DABRX but some have DABR. - POWER4+ and below have the DABR but no DABRX. - 970 and POWER5 and above have DABR and DABRX. - POWER8 has DAWR, hence no DABRX. This introduces CPU_FTR_DABRX and sets it on appropriate CPUs. We use the top 64 bits for CPU FTR bits since only 64 bit CPUs have this. Processors that don't have the DABRX will still work as they will fall back to software filtering these breakpoints via perf_exclude_event(). Signed-off-by: Michael Neuling <mikey@neuling.org> Reported-by: "Gorelik, Jacob (335F)" <jacob.gorelik@jpl.nasa.gov> cc: stable@vger.kernel.org (v3.9 only) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:29 +10:00
Michael Neuling	fb0fce3e55	powerpc/power8: Update denormalization handler POWER8 can take a denormalisation exception on any VSX registers. This does the extra 32 VSX registers we don't currently handle. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:26 +10:00
Michael Neuling	d7c67fb1cf	powerpc/pseries: Simplify denormalization handler The following simplifies the denorm code by using macros to generate the long stream of almost identical instructions. This patch results in no changes to the output binary, but removes a lot of lines of code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:22 +10:00
Michael Neuling	6a60f9e7d8	powerpc/power8: Fix oprofile and perf In `2ac6f42` powerpc/cputable: Fix oprofile_cpu_type on power8 we broke all power8 hw events. This reverts this change and uses oprofile_type instead. Perf now works on POWER8 again and oprofile will revert to using timers on POWER8. Kudos to mpe this fix. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:19 +10:00
Kevin Hao	c5df457ffe	powerpc/pci: Check the bus address instead of resource address in pcibios_fixup_resources If a BAR has the value of 0, we would assume that it is unset yet and then mark the resource as unset and would reassign it later. But after commit `6c5705fe` (powerpc/PCI: get rid of device resource fixups) the pcibios_fixup_resources is invoked after the bus address was translated to linux resource. So the value of res->start is resource address. And since the resource and bus address may be different, we should translate it to the bus address before doing the check. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-10 08:36:13 +10:00
Gu Zheng	8b1fce04dc	PCI: Convert alloc_pci_dev(void) to pci_alloc_dev(bus) Use the new pci_alloc_dev(bus) to replace the existing using of alloc_pci_dev(void). [bhelgaas: drop pci_bus ref later in pci_release_dev()] Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Airlie <airlied@linux.ie> Cc: Neela Syam Kolli <megaraidlinux@lsi.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andrew Morton <akpm@linux-foundation.org>	2013-06-05 13:49:36 -06:00
Will Schmidt	badec11b64	powerpc/cputable: Fix typo on P7+ cputable entry Fix a typo in setting COMMON_USER2_POWER7 bits to .cpu_user_features2 cpu specs table. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-01 09:30:03 +10:00
Kevin Hao	858957ab1e	powerpc/pci: Remove the unused variables in pci_process_bridge_OF_ranges The codes which ever used these two variables have gone. Throw away them too. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-01 08:29:28 +10:00
Kevin Hao	2798389604	powerpc/pci: Remove the stale comments of pci_process_bridge_OF_ranges These comments already don't apply to the current code. So just remove them. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-01 08:29:28 +10:00
Priyanka Jain	f7b3367774	powerpc/32bit:Store temporary result in r0 instead of r8 Commit `a9c4e541ea` "powerpc/kprobe: Complete kprobe and migrate exception frame" introduced a regression: While returning from exception handling in case of PREEMPT enabled, _TIF_NEED_RESCHED bit is checked in TI_FLAGS (thread_info flag) of current task. Only if this bit is set, it should continue with the process of calling preempt_schedule_irq() to schedule highest priority task if available. Current code assumes that r8 contains TI_FLAGS and check this for _TIF_NEED_RESCHED, but as r8 is modified in the code which executes before this check, r8 no longer contains the expected TI_FLAGS information. As a result check for comparison with _TIF_NEED_RESCHED was failing even if NEED_RESCHED bit is set in the current thread_info flag. Due to this, preempt_schedule_irq() and in turn scheduler was not getting called even if highest priority task is ready for execution. So, store temporary results in r0 instead of r8 to prevent r8 from getting modified as subsequent code is dependent on its value. Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com> CC: <stable@vger.kernel.org> [v3.7+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-01 08:29:27 +10:00
Michael Neuling	a515348fc6	powerpc/pseries: Kill all prefetch streams on context switch On context switch, we should have no prefetch streams leak from one userspace process to another. This frees up prefetch resources for the next process. Based on patch from Milton Miller. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-06-01 08:29:25 +10:00

1 2 3 4 5 ...

3814 Commits