At console init, when the kernel tries to flush the log buffer
the ePAPR byte-channel based console write fails silently,
losing the buffered messages.
This happens because The ePAPR para-virtualization init isn't
done early enough so that the hcall instruction to be set,
causing the byte-channel write hcall to be a nop.
To fix, change the ePAPR para-virt init to use early device
tree functions and move it in early init.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
There are 6 counters in e6500 core instead of 4 in e500 core.
Signed-off-by: Lijun Pan <Lijun.Pan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Back in commit 89713ed "Add timer, performance monitor and machine check
counts to /proc/interrupts" we added a count of PMU interrupts to the
output of /proc/interrupts.
At the time we named them "CNT" to match x86.
However in commit 89ccf46 "Rename 'performance counter interrupt'", the
x86 guys renamed theirs from "CNT" to "PMI".
Arguably changing the name could break someone's script, but I think the
chance of that is minimal, and it's preferable to have a name that 1) is
somewhat meaningful, and 2) matches x86.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This problem belongs to the core synchronization issues.
The cpu1 already updated spin_table values, but bootcore cannot get
this value in time.
After bootcpu hibiernation restore the pages. we are now running
with the kernel data of the old kernel fully restored. if we reset
the non-bootcpus that will be reset cache(tlb), the non-bootcpus
will get new address(map virtual and physical address spaces).
but bootcpu tlb cache still use boot kernel data, so we need to
invalidate the bootcpu tlb cache make it to get new main memory data.
log:
Enabling non-boot CPUs ...
smp_85xx_kick_cpu: timeout waiting for core 1 to reset
smp: failed starting cpu 1 (rc -2)
Error taking CPU1 up: -2
Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Reviewed-by: Anton Vorontsov <anton@enomsg.org>
[scottwood@freescale.com: reworded code comment for clarity]
Signed-off-by: Scott Wood <scottwood@freescale.com>
A PCIe erratum of mpc85xx may causes a core hang when a link of PCIe
goes down. when the link goes down, Non-posted transactions issued
via the ATMU requiring completion result in an instruction stall.
At the same time a machine-check exception is generated to the core
to allow further processing by the handler. We implements the handler
which skips the instruction caused the stall.
This patch depends on patch:
powerpc/85xx: Add platform_device declaration to fsl_pci.h
Signed-off-by: Zhao Chenhui <b35336@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Liu Shuo <soniccat.liu@gmail.com>
Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
The patch introduces flag EEH_DEV_SYSFS to keep track that the sysfs
entries for the corresponding EEH device (then PCI device) has been
added or removed, in order to avoid race condition.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
While restoring BARs for one specific PCI device, the pci_dev
instance should have been released. So it's not reliable to use
the pci_dev instance on restoring BARs. However, we still need
some information (e.g. PCIe capability position, header type) from
the pci_dev instance. So we have to store those information to
EEH device in advance.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When EEH error happens to one specific PE, some devices with drivers
supporting EEH won't except hotplug on the device. However, there
might have other deivces without driver, or with driver without EEH
support. For the case, we need do partial hotplug in order to make
sure that the PE becomes absolutely quite during reset. Otherise,
the PE reset might fail and leads to failure of error recovery.
The current code doesn't handle that 'mixed' case properly, it either
uses the error callbacks to the drivers, or tries hotplug, but doesn't
handle a PE (EEH domain) composed of a combination of the two.
The patch intends to support so-called "partial" hotplug for EEH:
Before we do reset, we stop and remove those PCI devices without
EEH sensitive driver. The corresponding EEH devices are not detached
from its PE, but with special flag. After the reset is done, those
EEH devices with the special flag will be scanned one by one.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When EEH error happens to one specific PE, the device drivers
of its attached EEH devices (PCI devices) are checked to see
the further action: reset with complete hotplug, or reset without
hotplug. However, that's not enough for those PCI devices whose
drivers can't support EEH, or those PCI devices without driver.
So we need do so-called "partial hotplug" on basis of PCI devices.
In the situation, part of PCI devices of the specific PE are
unplugged and plugged again after PE reset.
The patch changes pcibios_add_pci_devices() so that it can support
full hotplug and so-called "partial" hotplug based on device-tree
or real hardware. It's notable that pci_of_scan.c has been changed
for a bit in order to support the "partial" hotplug based on dev-tree.
Most of the generic code already supports that, we just need to
plumb it properly on our side.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, we're trasversing the EEH devices list using list_for_each_entry().
That's not safe enough because the EEH devices might be removed from
its parent PE while doing iteration. The patch replaces that with
list_for_each_entry_safe().
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we do normal hotplug, the PE (shadow EEH structure) shouldn't be
kept around.
However, we need to keep it if the hotplug an artifial one caused by
EEH errors recovery.
Since we remove EEH device through the PCI hook pcibios_release_device(),
the flag "purge_pe" passed to various functions is meaningless. So the patch
removes the meaningless flag and introduce new flag "EEH_PE_KEEP"
to save the PE while doing hotplug during EEH error recovery.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make some functions public in order to support hotplug on either specific
PCI bus or PCI device in future.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We will rely on pcibios_release_device() to remove the EEH cache
and unbind EEH device for the specific PCI device. So we shouldn't
hold the reference to the PCI device from EEH cache and EEH device.
Otherwise, pcibios_release_device() won't be called as we expected.
The patch removes the reference to the PCI device in EEH core.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Module CRCs are implemented as absolute symbols that get resolved by
a linker script. We build an intermediate .o that contains an
unresolved symbol for each CRC. genksysms parses this .o, calculates
the CRCs and writes a linker script that "resolves" the symbols to
the calculated CRC.
Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols
that need relocating and relocates them at boot. Commit d4703aef
(module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y)
added a hook to reverse the bogus relocations. Part of this patch
created a symbol at 0x0:
# head -2 /proc/kallsyms
0000000000000000 T reloc_start
c000000000000000 T .__start
This reloc_start symbol is causing lots of confusion to perf. It
thinks reloc_start is a massive function that stretches from 0x0 to
0xc000000000000000 and we get various cryptic errors out of perf,
including:
problem incrementing symbol count, skipping event
This patch removes the reloc_start linker script label and instead
defines it as PHYSICAL_START. We also need to wrap it with
CONFIG_PPC64 because the ppc32 kernel can set a non zero
PHYSICAL_START at compile time and we wouldn't want to subtract
it from the CRCs in that case.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
POWER8 comes with two different PVRs. This patch enables the additional
PVR in the cputable.
The existing entry (PVR=0x4b) is renamed to POWER8E and the new entry
(PVR=0x4d) is given POWER8.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This reverts commit 07fa7a0a8a ("hw_breakpoints: Fix racy access to
ptrace breakpoints") and removes ptrace_get/put_breakpoints() added by
other commits.
The patch was fine but we can no longer race with SIGKILL after commit
9899d11f65 ("ptrace: ensure arch_ptrace/ptrace_request can never race
with SIGKILL"), the __TASK_TRACED tracee can't be woken up and
->ptrace_bps[] can't go away.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This branch contains the following changes:
- Removal of CONFIG_OF_DEVICE, it is always enabled by CONFIG_OF
- Remove #ifdef from linux/of_platform.h to increase compiler syntax
coverage
- Bug fix for address decoding on Bimini and js2x powerpc platforms.
- miscellaneous binding changes
One note on the above. The binding changes going in from all kinds of
different trees has gotten rather out of hand. I picked up some during
this cycle, but even going though my tree isn't a great fit. Ian
Campbell has prototyped splitting the bindings and .dtb files into a
separate repository. The plan is to migrate to using that sometime in
the next few kernel releases which should get rid of a lot of the churn
on binding docs and .dts files.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJR1fP3AAoJEEFnBt12D9kB3IIP/0Q5ctMespiJ50+ThjGsaR3m
sUbQkMK46uL/oupXaJT2ybX2PxLN5LpgvO9rPt77hblOoL0+wZt+j9G0pLy1qZQZ
aHprH9SrpGJv6F0SFbHp/+D/m9vESPv+zwYzL9TvrOALvCD7OSZ7tHLaoF7Y1ADM
QnZa7pta3Owpu5NsGXaTXLpaZzfXzfWzf4PDzv2FsAIDbtuVJZGJZ7sJVO7Z0r+K
KCY85uKJ4VOHY0onBVlM6uoCnopOi2XMMkyxYvR28lL2Kiv2b3np46jG3zX1EZH5
Qxdu85QZn2oio9iaTeYKK8bG9aRIRsXnzCnF2s68n2rQlEtPpWKN9Lj2AS/KJ+Ig
obFTOFDHmxt1F4GIA0/HIPkDvRd7GTIwgwYYubEMi44E3Mae0N+xzkIRE41vYP7s
8zaNHbjAjsYjplsvN5gTPxxiU/ta24a5bl7Ont2zmOjAbXCsDajm4NCKZRJ3lb2f
FHNsS1zHGmqgJ9zt13GQabo/Tp4t3KwTzBirPQsDokRO4eoL6klcS3GCRv82VWC0
dLnzu92hXcyXgh7mX2sj6sRBSwNygxMn4ZsZJklle38/LynvtrzT72BOZjghS+Vh
l553uDInjSJ3IBrXnClPoyObcu50cmsBBgsK39FzU+MF9mcCHmkHQiT52zM6ZW3M
wwY1OfcZk3XaT7akcVu6
=CndB
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree updates from Grant Likely:
"This branch contains the following changes:
- Removal of CONFIG_OF_DEVICE, it is always enabled by CONFIG_OF
- Remove #ifdef from linux/of_platform.h to increase compiler syntax
coverage
- Bug fix for address decoding on Bimini and js2x powerpc platforms.
- miscellaneous binding changes
One note on the above. The binding changes going in from all kinds of
different trees has gotten rather out of hand. I picked up some
during this cycle, but even going though my tree isn't a great fit.
Ian Campbell has prototyped splitting the bindings and .dtb files into
a separate repository. The plan is to migrate to using that sometime
in the next few kernel releases which should get rid of a lot of the
churn on binding docs and .dts files"
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of: Fix address decoding on Bimini and js2x machines
of: remove CONFIG_OF_DEVICE
usb: chipidea: depend on CONFIG_OF instead of CONFIG_OF_DEVICE
of: remove of_platform_driver
ibmebus: convert of_platform_driver to platform_driver
driver core: move to_platform_driver to platform_device.h
mfd: DT bindings for the palmas family MFD
ARM: dts: omap3-devkit8000: fix NAND memory binding
of/base: fix typos
of: remove #ifdef from linux/of_platform.h
Pull powerpc updates from Ben Herrenschmidt:
"This is the powerpc changes for the 3.11 merge window. In addition to
the usual bug fixes and small updates, the main highlights are:
- Support for transparent huge pages by Aneesh Kumar for 64-bit
server processors. This allows the use of 16M pages as transparent
huge pages on kernels compiled with a 64K base page size.
- Base VFIO support for KVM on power by Alexey Kardashevskiy
- Wiring up of our nvram to the pstore infrastructure, including
putting compressed oopses in there by Aruna Balakrishnaiah
- Move, rework and improve our "EEH" (basically PCI error handling
and recovery) infrastructure. It is no longer specific to pseries
but is now usable by the new "powernv" platform as well (no
hypervisor) by Gavin Shan.
- I fixed some bugs in our math-emu instruction decoding and made it
usable to emulate some optional FP instructions on processors with
hard FP that lack them (such as fsqrt on Freescale embedded
processors).
- Support for Power8 "Event Based Branch" facility by Michael
Ellerman. This facility allows what is basically "userspace
interrupts" for performance monitor events.
- A bunch of Transactional Memory vs. Signals bug fixes and HW
breakpoint/watchpoint fixes by Michael Neuling.
And more ... I appologize in advance if I've failed to highlight
something that somebody deemed worth it."
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits)
pstore: Add hsize argument in write_buf call of pstore_ftrace_call
powerpc/fsl: add MPIC timer wakeup support
powerpc/mpic: create mpic subsystem object
powerpc/mpic: add global timer support
powerpc/mpic: add irq_set_wake support
powerpc/85xx: enable coreint for all the 64bit boards
powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx
powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig
powerpc/mpic: Add get_version API both for internal and external use
powerpc: Handle both new style and old style reserve maps
powerpc/hw_brk: Fix off by one error when validating DAWR region end
powerpc/pseries: Support compression of oops text via pstore
powerpc/pseries: Re-organise the oops compression code
pstore: Pass header size in the pstore write callback
powerpc/powernv: Fix iommu initialization again
powerpc/pseries: Inform the hypervisor we are using EBB regs
powerpc/perf: Add power8 EBB support
powerpc/perf: Core EBB support for 64-bit book3s
powerpc/perf: Drop MMCRA from thread_struct
powerpc/perf: Don't enable if we have zero events
...
Merge first patch-bomb from Andrew Morton:
- various misc bits
- I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been
distracted. There has been quite a bit of activity.
- About half the MM queue
- Some backlight bits
- Various lib/ updates
- checkpatch updates
- zillions more little rtc patches
- ptrace
- signals
- exec
- procfs
- rapidio
- nbd
- aoe
- pps
- memstick
- tools/testing/selftests updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits)
tools/testing/selftests: don't assume the x bit is set on scripts
selftests: add .gitignore for kcmp
selftests: fix clean target in kcmp Makefile
selftests: add .gitignore for vm
selftests: add hugetlbfstest
self-test: fix make clean
selftests: exit 1 on failure
kernel/resource.c: remove the unneeded assignment in function __find_resource
aio: fix wrong comment in aio_complete()
drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode
drivers/memstick/host/r592.c: convert to module_pci_driver
drivers/memstick/host/jmb38x_ms: convert to module_pci_driver
pps-gpio: add device-tree binding and support
drivers/pps/clients/pps-gpio.c: convert to module_platform_driver
drivers/pps/clients/pps-gpio.c: convert to devm_* helpers
drivers/parport/share.c: use kzalloc
Documentation/accounting/getdelays.c: avoid strncpy in accounting tool
aoe: update internal version number to v83
aoe: update copyright date
aoe: perform I/O completions in parallel
...
PCI device hotplug
- Add pci_alloc_dev() interface (Gu Zheng)
- Add pci_bus_get()/put() for reference counting (Jiang Liu)
- Fix SR-IOV reference count issues (Jiang Liu)
- Remove unused acpi_pci_roots list (Jiang Liu)
MSI
- Conserve interrupt resources on x86 (Alexander Gordeev)
AER
- Force fatal severity when component has been reset (Betty Dall)
- Reset link below Root Port as well as Downstream Port (Betty Dall)
- Fix "Firmware first" flag setting (Bjorn Helgaas)
- Don't parse HEST for non-PCIe devices (Bjorn Helgaas)
ASPM
- Warn when we can't disable ASPM as driver requests (Bjorn Helgaas)
Miscellaneous
- Add CircuitCo PCI IDs (Darren Hart)
- Add AMD CZ SATA and SMBus PCI IDs (Shane Huang)
- Work around Ivytown NTB BAR size issue (Jon Mason)
- Detect invalid initial BAR values (Kevin Hao)
- Add pcibios_release_device() (Sebastian Ott)
- Fix powerpc & sparc PCI_UNKNOWN power state usage (Bjorn Helgaas)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJR0dhDAAoJEFmIoMA60/r8AUoP/RrKOXC2GGZCqUIKjUyxaCU+
NXwaNhjfRuge5lZHE8fUAnYVveFTv0iNo8if/md866/pS4il3vxaMWRhZrBddXqe
juyxPUGaOb5NmI2C+g5ebQ1xHhnOU6kWrgQ5kQk5GmJdm6BpiWDCFaalyioYj27v
FoN/25IG+5EtJjP6kRdQGFZq+RYOqlBfQp4fdFmY5bDsQiCLREH6YWHeNSkH+t1I
Eh84WESqGzgaZyCb9QKM2AcU/HMKLux4VXAYp9idVr3tH1j9b/klQI7xNW7sPnkY
LIzKTcfF89iXkhxc7zrF0O/n5rC5Cp7LQpiFMV6yCT3w25yWpq9itOwqcZ/nfCv6
fje8P1B2lwGrizkwKKLcosTzWkJewvfLkVye90WS3g0i3zlijF4pfEiw3a2ujA91
MP9/JmX+ZZ5QeGyPuFmYJyMlInH4vtSdegl9jtaeuX4cOnuMP+Ouxnxc+mH2bOfl
Z5/K1OSCYLfb27uWM7od2lgb+GFHLMP+RMy073h0ZMpDvM6EnZy5iu1zU9+yJO4S
8/aRhBz4h+YEBinnXOJvHzMfu3wQQ7UvXZqEspgsug2Z5xHvxhMLhrJQgpVUSdsW
Nrdm1dNdACV/cvt/lWzUE7SmUaOua/r/cVmViF2ryeRET7t65in+5NHXmXP8ET+r
1WA7pbykfegC9uY84PaK
=X3Lo
-----END PGP SIGNATURE-----
Merge tag 'pci-v3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI changes from Bjorn Helgaas:
"PCI device hotplug
- Add pci_alloc_dev() interface (Gu Zheng)
- Add pci_bus_get()/put() for reference counting (Jiang Liu)
- Fix SR-IOV reference count issues (Jiang Liu)
- Remove unused acpi_pci_roots list (Jiang Liu)
MSI
- Conserve interrupt resources on x86 (Alexander Gordeev)
AER
- Force fatal severity when component has been reset (Betty Dall)
- Reset link below Root Port as well as Downstream Port (Betty Dall)
- Fix "Firmware first" flag setting (Bjorn Helgaas)
- Don't parse HEST for non-PCIe devices (Bjorn Helgaas)
ASPM
- Warn when we can't disable ASPM as driver requests (Bjorn Helgaas)
Miscellaneous
- Add CircuitCo PCI IDs (Darren Hart)
- Add AMD CZ SATA and SMBus PCI IDs (Shane Huang)
- Work around Ivytown NTB BAR size issue (Jon Mason)
- Detect invalid initial BAR values (Kevin Hao)
- Add pcibios_release_device() (Sebastian Ott)
- Fix powerpc & sparc PCI_UNKNOWN power state usage (Bjorn Helgaas)"
* tag 'pci-v3.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (51 commits)
MAINTAINERS: Add ACPI folks for ACPI-related things under drivers/pci
PCI: Add CircuitCo vendor ID and subsystem ID
PCI: Use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)
PCI: Return early on allocation failures to unindent mainline code
PCI: Simplify IOV implementation and fix reference count races
PCI: Drop redundant setting of bus->is_added in virtfn_add_bus()
unicore32/PCI: Remove redundant call of pci_bus_add_devices()
m68k/PCI: Remove redundant call of pci_bus_add_devices()
PCI / ACPI / PM: Use correct power state strings in messages
PCI: Fix comment typo for pcie_pme_remove()
PCI: Rename pci_release_bus_bridge_dev() to pci_release_host_bridge_dev()
PCI: Fix refcount issue in pci_create_root_bus() error recovery path
ia64/PCI: Clean up pci_scan_root_bus() usage
PCI/AER: Reset link for devices below Root Port or Downstream Port
ACPI / APEI: Force fatal AER severity when component has been reset
PCI/AER: Remove "extern" from function declarations
PCI/AER: Move AER severity defines to aer.h
PCI/AER: Set dev->__aer_firmware_first only for matching devices
PCI/AER: Factor out HEST device type matching
PCI/AER: Don't parse HEST table for non-PCIe devices
...
saved_max_pfn is used to know the amount of memory that the previous
kernel used. And for powerpc, we set saved_max_pfn by passing the kernel
commandline parameter "savemaxmem=".
The only user of saved_max_pfn in powerpc is read_oldmem interface. Since
we have removed read_oldmem, we don't need this parameter anymore.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Address more review comments from last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
pattern '0'. This could be used to get rid of poison_init_mem()
on ARM64.
2) A previous patch has disabled memory poison for initmem on s390
by mistake, so restore to the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:
arch/arm/mm/init.c: In function 'mem_init':
arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
^
In file included from include/linux/mman.h:4:0,
from arch/arm/mm/init.c:15:
include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
In file included from arch/mips/include/asm/page.h:49:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:4,
from include/linux/mm.h:8,
from mm/page_alloc.c:18:
arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
mm/page_alloc.c: In function 'free_area_init_nodes':
mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
Also address some minor code review comments.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull second set of VFS changes from Al Viro:
"Assorted f_pos race fixes, making do_splice_direct() safe to call with
i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
->d_hash/->d_compare calling conventions changes from Linus, misc
stuff all over the place."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
Document ->tmpfile()
ext4: ->tmpfile() support
vfs: export lseek_execute() to modules
lseek_execute() doesn't need an inode passed to it
block_dev: switch to fixed_size_llseek()
cpqphp_sysfs: switch to fixed_size_llseek()
tile-srom: switch to fixed_size_llseek()
proc_powerpc: switch to fixed_size_llseek()
ubi/cdev: switch to fixed_size_llseek()
pci/proc: switch to fixed_size_llseek()
isapnp: switch to fixed_size_llseek()
lpfc: switch to fixed_size_llseek()
locks: give the blocked_hash its own spinlock
locks: add a new "lm_owner_key" lock operation
locks: turn the blocked_list into a hashtable
locks: convert fl_link to a hlist_node
locks: avoid taking global lock if possible when waking up blocked waiters
locks: protect most of the file_lock handling with i_lock
locks: encapsulate the fl_link list handling
locks: make "added" in __posix_lock_file a bool
...
When Jeremy introduced the new device-tree based reserve map, he made
the code in early_reserve_mem_dt() bail out if it found one, thus not
reserving the initrd nor processing the old style map.
I hit problems with variants of kexec that didn't put the initrd in
the new style map either. While these could/will be fixed, I believe
we should be safe here and rather reserve more than not enough.
We could have a firmware passing stuff via the new style map, and
in the middle, a kexec that knows nothing about it and adding other
things to the old style map.
I don't see a big issue with processing both and reserving everything
that needs to be. memblock_reserve() supports overlaps fine these days.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The Data Address Watchpoint Register (DAWR) on POWER8 can take a 512
byte range but this range must not cross a 512 byte boundary.
Unfortunately we were off by one when calculating the end of the region,
hence we were not allowing some breakpoint regions which were actually
valid. This fixes this error.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
=Ob6Z
-----END PGP SIGNATURE-----
Merge tag 'v3.10' into next
Merge 3.10 in order to get some of the last minute powerpc
changes, resolve conflicts and add additional fixes on top
of them.
Add support for EBB (Event Based Branches) on 64-bit book3s. See the
included documentation for more details.
EBBs are a feature which allows the hardware to branch directly to a
specified user space address when a PMU event overflows. This can be
used by programs for self-monitoring with no kernel involvement in the
inner loop.
Most of the logic is in the generic book3s code, primarily to avoid a
proliferation of PMU callbacks.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In commit 59affcd "Context switch more PMU related SPRs" I added more
PMU SPRs to thread_struct, later modified in commit b11ae95. To add
insult to injury it turns out we don't need to switch MMCRA as it's
only user readable, and the value is recomputed by the PMU code.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Similar to the facility unavailble exception, except the facilities are
controlled by HFSCR.
Adapt the facility_unavailable_exception() so it can be called for
either the regular or Hypervisor facility unavailable exceptions.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The exception at 0xf60 is not the TM (Transactional Memory) unavailable
exception, it is the "Facility Unavailable Exception", rename it as
such.
Flesh out the handler to acknowledge the fact that it can be called for
many reasons, one of which is TM being unavailable.
Use STD_EXCEPTION_COMMON() for the exception body, for some reason we
had it open-coded, I've checked the generated code is identical.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.10]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
KVMTEST is a macro which checks whether we are taking an exception from
guest context, if so we branch out of line and eventually call into the
KVM code to handle the switch.
When running real guests on bare metal (HV KVM) the hardware ensures
that we never take a relocation on exception when transitioning from
guest to host. For PR KVM we disable relocation on exceptions ourself in
kvmppc_core_init_vm(), as of commit a413f47 "Disable relocation on
exceptions whenever PR KVM is active".
So convert all the RELON macros to use NOTEST, and drop the remaining
KVM_HANDLER() definitions we have for 0xe40 and 0xe80.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.9+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have relocation on exception handlers defined for h_data_storage and
h_instr_storage. However we will never take relocation on exceptions for
these because they can only come from a guest, and we never take
relocation on exceptions when we transition from guest to host.
We also have a handler for hmi_exception (Hypervisor Maintenance) which
is defined in the architecture to never be delivered with relocation on,
see see v2.07 Book III-S section 6.5.
So remove the handlers, leaving a branch to self just to be double extra
paranoid.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
CC: <stable@vger.kernel.org> [v3.9+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
the smp_release_cpus is a normal funciton and called in normal environments,
but it calls the __initdata spinning_secondaries.
need modify spinning_secondaries to match smp_release_cpus.
the related warning:
(the linker report boot_paca.33377, but it should be spinning_secondaries)
-----------------------------------------------------------------------------
WARNING: arch/powerpc/kernel/built-in.o(.text+0x23176): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377
The function .smp_release_cpus() references
the variable __initdata boot_paca.33377.
This is often because .smp_release_cpus lacks a __initdata
annotation or the annotation of boot_paca.33377 is wrong.
WARNING: arch/powerpc/kernel/built-in.o(.text+0x231fe): Section mismatch in reference from the function .smp_release_cpus() to the variable .init.data:boot_paca.33377
The function .smp_release_cpus() references
the variable __initdata boot_paca.33377.
This is often because .smp_release_cpus lacks a __initdata
annotation or the annotation of boot_paca.33377 is wrong.
-----------------------------------------------------------------------------
Signed-off-by: Chen Gang <gang.chen@asianux.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When error occurs, need return the related error code to let upper
caller know about it.
ppc_md.nvram_size() can return the error code (e.g. core99_nvram_size()
in 'arch/powerpc/platforms/powermac/nvram.c').
Also set ret value when only need it, so can save structions for normal
cases.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
This removes all the powerpc uses of the __cpuinit macros. There
are no __CPUINIT users in assembly files in powerpc.
[1] https://lkml.org/lkml/2013/5/20/589
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This typedef is unnecessary and should just be removed.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For an unknown relocation type since the value of r4 is just the 8bit
relocation type, the sum of r4 and r7 may yield an invalid memory
address. For example:
In normal case:
r4 = c00xxxxx
r7 = 40000000
r4 + r7 = 000xxxxx
For an unknown relocation type:
r4 = 000000xx
r7 = 40000000
r4 + r7 = 400000xx
400000xx is an invalid memory address for a board which has just
512M memory.
And for operations such as dcbst or icbi may cause bus error for an
invalid memory address on some platforms and then cause the board
reset. So we should skip the flush/invalidate the d/icache for
an unknown relocation type.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch is for avoiding following build warnings:
The function .pnv_pci_ioda_fixup() references
the function __init .eeh_init().
This is often because .pnv_pci_ioda_fixup lacks a __init
The function .pnv_pci_ioda_fixup() references
the function __init .eeh_addr_cache_build().
This is often because .pnv_pci_ioda_fixup lacks a __init
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We needn't the the whole backtrace other than one-line message in
the error reporting interrupt handler. For errors triggered by
access PCI config space or MMIO, we replace "WARN(1, ...)" with
pr_err() and dump_stack(). The patch also adds more output messages
to indicate what EEH core is doing. Besides, some printk() are
replaced with pr_warning().
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On the PowerNV platform, the EEH address cache isn't built correctly
because we skipped the EEH devices without binding PE. The patch
fixes that.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
After reset (e.g. complete reset) in order to bring the fenced PHB
back, the PCIe link might not be ready yet. The patch intends to
make sure the PCIe link is ready before accessing its subordinate
PCI devices. The patch also fixes that wrong values restored to
PCI_COMMAND register for PCI bridges.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When the PHB is fenced or dead, it's pointless to collect the data
from PCI config space of subordinate PCI devices since it should
return 0xFF's. The patch also fixes overwritten buffer while getting
PCI config data.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we treclaim and trecheckpoint there's an unavoidable period when r1
will not be a valid kernel stack pointer.
This patch clears the MSR recoverable interrupt (RI) bit over these
regions to indicate we have an invalid kernel stack pointer.
For treclaim, the region over which we clear MSR RI is larger than
required to avoid the need for an extra costly mtmsrd.
Thanks to Paulus for suggesting this change.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
String instruction emulation would erroneously result in a segfault if
the upper bits of the EA are set and is so high that it fails access
check. Truncate the EA to 32 bits if the process is 32-bit.
Signed-off-by: James Yang <James.Yang@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 37f02195b (powerpc/pci: fix PCI-e devices rescan issue on powerpc
platform) fixes a problem with interrupt and DMA initialization on hot
plugged devices. With this commit, interrupt and DMA initialization for
hot plugged devices is handled in the pci device enable function.
This approach has a couple of drawbacks. First, it creates two code paths
for device initialization, one for hot plugged devices and another for devices
known during the initial PCI scan. Second, the initialization code for hot
plugged devices is only called when the device is enabled, ie typically
in the probe function. Also, the platform specific setup code is called each
time pci_enable_device() is called, not only once during device discovery,
meaning it is actually called multiple times, once for devices discovered
during the initial scan and again each time a driver is re-loaded.
The visible result is that interrupt pins are only assigned to hot plugged
devices when the device driver is loaded. Effectively this changes the PCI
probe API, since pci_dev->irq and the device's dma configuration will now
only be valid after pci_enable() was called at least once. A more subtle
change is that platform specific PCI device setup is moved from device
discovery into the driver's probe function, more specifically into the
pci_enable_device() call.
To fix the inconsistencies, add new function pcibios_add_device.
Call pcibios_setup_device from pcibios_setup_bus_devices if device setup
is not complete, and from pcibios_add_device if bus setup is complete.
With this change, device setup code is moved back into device initialization,
and called exactly once for both static and hot plugged devices.
[ This also fixes a regression introduced by the above patch which
causes dev->irq to be overwritten under some cirumstances after
MSIs have been enabled for the device which leads to crashes due
to the MSI core "hijacking" dev->irq to store the base MSI number
and not the LSI. --BenH
]
Cc: Yuanquan Chen <Yuanquan.Chen@freescale.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To replace down() with down_interrutible() to avoid following
warning:
[c00000007ba7b710] [c000000000014410] .__switch_to+0x1b0/0x380
[c00000007ba7b7c0] [c0000000007b408c] .__schedule+0x3ec/0x970
[c00000007ba7ba50] [c0000000007b1f24] .schedule_timeout+0x1a4/0x2b0
[c00000007ba7bb30] [c0000000007b34a4] .__down+0xa4/0x104
[c00000007ba7bbf0] [c0000000000b9230] .down+0x60/0x70
[c00000007ba7bc80] [c0000000000336d0] .eeh_event_handler+0x70/0x190
[c00000007ba7bd30] [c0000000000b1a58] .kthread+0xe8/0xf0
[c00000007ba7be30] [c00000000000a05c] .ret_from_kernel_thread+0x5c/0x8
This also avoids keeping the load average up while doing nothing.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Originally, eeh_mutex was introduced to protect the PE hierarchy
tree and the attached EEH devices because EEH core was possiblly
running with multiple threads to access the PE hierarchy tree.
However, we now have only one kthread in EEH core. So we needn't
the eeh_mutex and just remove it.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In 9422de3 "powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint
registers" we changed the way we mark extraneous irqs with this:
- info->extraneous_interrupt = !((bp->attr.bp_addr <= dar) &&
- (dar - bp->attr.bp_addr < bp->attr.bp_len));
+ if (!((bp->attr.bp_addr <= dar) &&
+ (dar - bp->attr.bp_addr < bp->attr.bp_len)))
+ info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
Unfortunately this is bogus as it never clears extraneous IRQ if it's already
set.
This correctly clears extraneous IRQ before possibly setting it.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The smallest match region for both the DABR and DAWR is 8 bytes, so the
kernel needs to filter matches when users want to look at regions smaller than
this.
Currently we set the length of PPC_BREAKPOINT_MODE_EXACT breakpoints to 8.
This is wrong as in exact mode we should only match on 1 address, hence the
length should be 1.
This ensures that the kernel will filter out any exact mode hardware breakpoint
matches on any addresses other than the requested one.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Replace find_linux_pte with find_linux_pte_or_hugepte and explicitly
document why we don't need to handle transparent hugepages at callsites.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's meaningless to handle frozen PE if we already had fenced PHB.
The patch intends to check the PHB state before checking PE. If the
PHB has been put into fenced state, we need take care of that firstly.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PowerNV platform, the EEH event caused by interrupt won't have
binding PE. The patch enables EEH core to handle the special event.
To avoid the current logic we have, The eeh_handle_event() is renamed
to eeh_handle_normal_event(), and the eeh_handle_special_event() is
introduced. The function eeh_handle_event() dispatches to above two
functions according to the input parameter. Besides, new backend
"next_error" added to eeh_ops and it's expected to have following
return values:
4 - Dead IOC 3 - Dead PHB
2 - Fenced PHB 1 - Frozen PE
0 - No error found
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
An EEH event is created and queued to the event queue for each
ingress EEH error. When there're mutiple EEH errors, we need serialize
the process to keep consistent PE state (flags). The spinlock
"confirm_error_lock" was introduced for the purpose. We'll inject
EEH event upon error reporting interrupts on PowerNV platform. So
we export the spinlock for that to use for consistent PE state.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PowerNV platform, we might run into the situation where subsequent
events are duplicated events of former one, which is being processed.
For the case, we need the function implemented by the patch to purge
EEH events accordingly.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We're not expecting that one specific PE got frozen for over 5
times in last hour. Otherwise, the PE will be removed from the
system upon newly coming EEH errors. The patch introduces time
stamp to trace the first error on specific PE in last hour and
function to update that accordingly. Besides, the time stamp
is recovered during PE hotplug path as we did for frozen count.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We possiblly have multiple kthreads running for multiple EEH errors
(events) and use one spinlock to make the process of handling those
EEH events serialized. That's unnecessary and the patch creates only
one kthread, which is started during EEH core initialization time in
eeh_init(). A new semaphore introduced to count the number of existing
EEH events in the queue and the kthread waiting on the semaphore.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
While doing EEH recovery, the PCI devices of the problematic PE
should be removed and then added to the system again. During the
so-called hotplug event, the PCI devices of the problematic PE
will be probed through early/late phase. We would delay EEH probe
on late point for PowerNV platform since the PCI device isn't
available in early phase.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We shouldn't check that the returned PE status is exactly equal to
(EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE) but instead only check
that they are both set.
[benh: changelog]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch adds new EEH operation post_init. It's used to notify
the platform that EEH core has completed the EEH probe. By that,
PowerNV platform starts to use the services supplied by EEH
functionality.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For EEH on PowerNV platform, we will do EEH probe based on the
real PCI devices. The PCI devices are available after PCI probe.
So we have to call eeh_init() explicitly on PowerNV platform
after PCI probe. The patch also does EEH probe for PowerNV platform
in eeh_init().
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There're several types of PEs can be supported for now: PHB, Bus
and Device dependent PE. For PCI bus dependent PE, tracing the
corresponding PCI bus from PE (struct eeh_pe) would make the code
more efficient. The patch also enables the retrieval of PCI bus based
on the PCI bus dependent PE.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
While processing EEH event interrupt from P7IOC, we need function
to retrieve the PE according to the indicated EEH device. The patch
makes function eeh_pe_get() public so that other source files can call
it for that purpose. Also, the patch fixes referring to wrong BDF
(Bus/Device/Function) address while searching PE in function
__eeh_pe_get().
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
One of the possible cases indicated by P7IOC interrupt is fenced
PHB. For that case, we need fetch the PE corresponding to the PHB
and disable the PHB and all subordinate PCI buses/devices, recover
from the fenced state and eventually enable the whole PHB. We need
one function to fetch the PHB PE outside eeh_pe.c and the patch is
going to make eeh_phb_pe_get() public for that purpose.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch moves the common part of EEH core into arch/powerpc/kernel
directory so that we needn't PPC_PSERIES while compiling POWERNV
platform:
* Move the EEH common part into arch/powerpc/kernel
* Move the functions for PCI hotplug from pSeries platform to
arch/powerpc/kernel/pci-hotplug.c
* Move CONFIG_EEH from arch/powerpc/platforms/pseries/Kconfig to
arch/powerpc/platforms/Kconfig
* Adjust makefile accordingly
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we only restore signals which are transactionally suspended but it's
possible that the transaction can be restored even when it's active. Most
likely this will result in a transactional rollback by the hardware as the
transaction will have been doomed by an earlier treclaim.
The current code is a legacy of earlier kernel implementations which did
software rollback of active transactions in the kernel. That code has now gone
but we didn't correctly fix up this part of the signals code which still makes
assumptions based on having software rollback.
This changes the signal return code to always restore both contexts on 64 bit
signal return. It also ensures that the MSR TM bits are properly restored from
the signal context which they are not currently.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we only restore signals which are transactionally suspended but it's
possible that the transaction can be restored even when it's active. Most
likely this will result in a transactional rollback by the hardware as the
transaction will have been doomed by an earlier treclaim.
The current code is a legacy of earlier kernel implementations which did
software rollback of active transactions in the kernel. That code has now gone
but we didn't correctly fix up this part of the signals code which still makes
assumptions based on having software rollback.
This changes the signal return code to always restore both contexts on 32 bit
rt signal return.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we clear out the MSR TM bits on signal return assuming that the
signal should never return to an active transaction.
This is bogus as the user may do this. It's most likely the transaction will
be doomed due to a treclaim but that's a problem for the HW not the kernel.
The current code is a legacy of earlier kernel implementations which did
software rollback of active transactions in the kernel. That code has now gone
but we didn't correctly fix up this part of the signals code which still makes
the assumption that it must be returning to a suspended transaction.
This pulls out both MSR TM bits from the user supplied context rather than just
setting TM suspend. We pull out only the bits needed to ensure the user can't
do anything dangerous to the MSR.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently sys_sigreturn() is TM unaware. Therefore, if we take a 32 bit signal
without SIGINFO (non RT) inside a transaction, on signal return we don't
restore the signal frame correctly.
This checks if the signal frame being restoring is an active transaction, and
if so, it copies the additional state to ptregs so it can be restored.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The MSR TM controls are in the top 32 bits of the MSR hence on 32 bit signals,
we stick the top half of the MSR in the checkpointed signal context so that the
user can access it.
Unfortunately, we don't currently write anything to the checkpointed signal
context when coming in a from a non transactional process and hence the top MSR
bits can contain junk.
This updates the 32 bit signal handling code to always write something to the
top MSR bits so that users know if the process is transactional or not and the
kernel can use it on signal return.
Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is duplicated code from math-emu and implements such a small
subset of the FPU (load/stores/fmr) that it's essentially pointless
nowdays.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
(Including 64-bit ones)
This allow SW emulation by the kernel of optional instructions
such as fsqrt which aren't implemented on some processors, and
thus fixes some Fedora 19 issues such as Anaconda since the
compiler is set to generate those by default on 64-bit.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On BookE (Branch taken + Single Step) is as same as Branch Taken
on BookS and in Linux we simulate BookS behavior for BookE as well.
When doing so, in Branch taken handling we want to set DBCR0_IC but
we update the current->thread->dbcr0 and not DBCR0.
Now on 64bit the current->thread.dbcr0 (and other debug registers)
is synchronized ONLY on context switch flow. But after handling
Branch taken in debug exception if we return back to user space
without context switch then single stepping change (DBCR0_ICMP)
does not get written in h/w DBCR0 and Instruction Complete exception
does not happen.
This fixes using ptrace reliably on BookE-PowerPC
lmbench latency test (lat_syscall) Results are (they varies a little
on each run)
1) ./lat_syscall <action> /dev/shm/uImage
action: Open read write stat fstat null
Before: 3.8618 0.2017 0.2851 1.6789 0.2256 0.0856
After: 3.8580 0.2017 0.2851 1.6955 0.2255 0.0856
1) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage
action: Open read write stat fstat null
Before: 4.1388 0.2238 0.3066 1.7106 0.2256 0.0856
After: 4.1413 0.2236 0.3062 1.7107 0.2256 0.0856
[ Slightly modified to avoid extra branch in the fast path
on Book3S and fix build on all non-BookE 64-bit -- BenH
]
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This initializes IOMMU groups based on the IOMMU configuration
discovered during the PCI scan on POWERNV (POWER non virtualized)
platform. The IOMMU groups are to be used later by the VFIO driver,
which is used for PCI pass through.
It also implements an API for mapping/unmapping pages for
guest PCI drivers and providing DMA window properties.
This API is going to be used later by QEMU-VFIO to handle
h_put_tce hypercalls from the KVM guest.
The iommu_put_tce_user_mode() does only a single page mapping
as an API for adding many mappings at once is going to be
added later.
Although this driver has been tested only on the POWERNV
platform, it should work on any platform which supports
TCE tables. As h_put_tce hypercall is received by the host
kernel and processed by the QEMU (what involves calling
the host kernel again), performance is not the best -
circa 220MB/s on 10Gb ethernet network.
To enable VFIO on POWER, enable SPAPR_TCE_IOMMU config
option and configure VFIO as required.
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Based on benh's proposal at
https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-September/101237.html,
this change provides support for reserving memory from the
reserved-ranges node at the root of the device tree.
We just call memblock_reserve on these ranges for now.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Previously in order to handle the edge sensitive decrementers,
we choose to set the decrementer to 1 to trigger a decrementer
interrupt when re-enabling interrupts. But with the rework of the
lazy EE, we would replay the decrementer interrupt when re-enabling
interrupts if a decrementer interrupt occurs with irq soft-disabled.
So there is no need to trigger a decrementer interrupt in this case
any more.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch moves the single step enable code used by kprobe to a generic
routine header so that, it can be re-used by other code, in this case,
uprobes. No functional changes.
Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc: Ananth N Mavinakaynahalli <ananth@in.ibm.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
External/Decrement exceptions have lower priority than the Debug Exception.
So, we don't have to disable the External interrupts before a single step.
However, on BookE, Critical Input Exception(CE) has higher priority than a
Debug Exception. Hence we mask them.
Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Ananth N Mavinakaynahalli <ananth@in.ibm.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When replaying interrupts (as a result of the interrupt occurring
while soft-disabled), in the case of the decrementer, we are exclusively
testing for a pending timer target. However we also use decrementer
interrupts to trigger the new "irq_work", which in this case would
be missed.
This change the logic to force a replay in both cases of a timer
boundary reached and a decrementer interrupt having actually occurred
while disabled. The former test is still useful to catch cases where
a CPU having been hard-disabled for a long time completely misses the
interrupt due to a decrementer rollover.
CC: <stable@vger.kernel.org> [v3.4+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Normally, the kernel emulates a few instructions that are unimplemented
on some processors (e.g. the old dcba instruction), or privileged (e.g.
mfpvr). The emulation of unimplemented instructions is currently not
working on the PowerNV platform. The reason is that on these machines,
unimplemented and illegal instructions cause a hypervisor emulation
assist interrupt, rather than a program interrupt as on older CPUs.
Our vector for the emulation assist interrupt just calls
program_check_exception() directly, without setting the bit in SRR1
that indicates an illegal instruction interrupt. This fixes it by
making the emulation assist interrupt set that bit before calling
program_check_interrupt(). With this, old programs that use no-longer
implemented instructions such as dcba now work again.
CC: <stable@vger.kernel.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's possible for us to crash when running with ftrace enabled, eg:
Bad kernel stack pointer bffffd12 at c00000000000a454
cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
pc: c00000000000a454: resume_kernel+0x34/0x60
lr: c00000000000335c: performance_monitor_common+0x15c/0x180
sp: bffffd12
msr: 8000000000001032
dar: bffffd12
dsisr: 42000000
If we look at current's stack (paca->__current->stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info->flags and set
_TIF_EMULATE_STACK_STORE.
Dumping the stack we see:
3:mon> t c0000002ecab0000
[c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
[c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
--- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
[c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
[c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
[c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
[c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
[c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
[c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
[c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
[c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
[c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
... and so on
__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.
This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.
The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().
Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.
[ This regression was introduced by fe1952fc0a
"powerpc: Rework runlatch code" by myself --BenH
]
CC: <stable@vger.kernel.org> [v3.4+]
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
ibmebus is the last remaining user of of_platform_driver and the
conversion to a regular platform driver is trivial.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
In commit 59affcd I added context switching of more PMU SPRs, because
they are potentially exposed to userspace on Power8. However despite me
being a smart arse in the commit message it's actually not correct. In
particular it interacts badly with a global perf record.
We will have to do something more complicated, but that will have to
wait for 3.11.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When introducing support for DABRX in 4474ef0, we broke older 32-bit CPUs
that don't have that register.
Some CPUs have a DABR but not DABRX. Configuration are:
- No 32bit CPUs have DABRX but some have DABR.
- POWER4+ and below have the DABR but no DABRX.
- 970 and POWER5 and above have DABR and DABRX.
- POWER8 has DAWR, hence no DABRX.
This introduces CPU_FTR_DABRX and sets it on appropriate CPUs. We use
the top 64 bits for CPU FTR bits since only 64 bit CPUs have this.
Processors that don't have the DABRX will still work as they will fall
back to software filtering these breakpoints via perf_exclude_event().
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: "Gorelik, Jacob (335F)" <jacob.gorelik@jpl.nasa.gov>
cc: stable@vger.kernel.org (v3.9 only)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
POWER8 can take a denormalisation exception on any VSX registers.
This does the extra 32 VSX registers we don't currently handle.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The following simplifies the denorm code by using macros to generate the long
stream of almost identical instructions.
This patch results in no changes to the output binary, but removes a lot of
lines of code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In 2ac6f42 powerpc/cputable: Fix oprofile_cpu_type on power8
we broke all power8 hw events.
This reverts this change and uses oprofile_type instead. Perf now works
on POWER8 again and oprofile will revert to using timers on POWER8.
Kudos to mpe this fix.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If a BAR has the value of 0, we would assume that it is unset yet and
then mark the resource as unset and would reassign it later. But after
commit 6c5705fe (powerpc/PCI: get rid of device resource fixups)
the pcibios_fixup_resources is invoked after the bus address was
translated to linux resource. So the value of res->start is resource
address. And since the resource and bus address may be different, we
should translate it to the bus address before doing the check.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the new pci_alloc_dev(bus) to replace the existing using of
alloc_pci_dev(void).
[bhelgaas: drop pci_bus ref later in pci_release_dev()]
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Neela Syam Kolli <megaraidlinux@lsi.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fix a typo in setting COMMON_USER2_POWER7 bits to .cpu_user_features2
cpu specs table.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The codes which ever used these two variables have gone. Throw away
them too.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These comments already don't apply to the current code. So just remove
them.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit a9c4e541ea
"powerpc/kprobe: Complete kprobe and migrate exception frame"
introduced a regression:
While returning from exception handling in case of PREEMPT enabled,
_TIF_NEED_RESCHED bit is checked in TI_FLAGS (thread_info flag) of current
task. Only if this bit is set, it should continue with the process of
calling preempt_schedule_irq() to schedule highest priority task if
available.
Current code assumes that r8 contains TI_FLAGS and check this for
_TIF_NEED_RESCHED, but as r8 is modified in the code which executes before
this check, r8 no longer contains the expected TI_FLAGS information.
As a result check for comparison with _TIF_NEED_RESCHED was failing even if
NEED_RESCHED bit is set in the current thread_info flag. Due to this,
preempt_schedule_irq() and in turn scheduler was not getting called even if
highest priority task is ready for execution.
So, store temporary results in r0 instead of r8 to prevent r8 from getting
modified as subsequent code is dependent on its value.
Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
CC: <stable@vger.kernel.org> [v3.7+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On context switch, we should have no prefetch streams leak from one
userspace process to another. This frees up prefetch resources for the
next process.
Based on patch from Milton Miller.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>