- Better machine check handling for HV KVM
- Ability to support guests with threads=2, 4 or 8 on POWER9
- Fix for a race that could cause delayed recognition of signals
- Fix for a bug where POWER9 guests could sleep with interrupts
pending.
* pci/resource:
PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11
PCI: Do not disregard parent resources starting at 0x0
Conflicts:
arch/x86/pci/fixup.c
* pci/pm:
PCI/PM: Avoid using device_may_wakeup() for runtime PM
x86/PCI: Avoid AMD SB7xx EHCI USB wakeup defect
PCI/PM: Restore the status of PCI devices across hibernation
drm/radeon: make MacBook Pro d3_delay quirk more generic
drm/amdgpu: remove unnecessary save/restore of pdev->d3_delay
PCI/PM: Add needs_resume flag to avoid suspend complete optimization
PCI: imx6: Fix config read timeout handling
switchtec: Fix minor bug with partition ID register
switchtec: Use new cdev_device_add() helper function
PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
Update the Hyper-V vPCI driver to use the Server-2016 version of the vPCI
protocol, fixing MSI creation and retargeting issues.
Signed-off-by: Jork Loeser <jloeser@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
With the introduction of struct pci_host_bridge.map_irq pointer it is
possible to assign IRQs for all devices originating from a PCI host bridge
at probe time; this is implemented through pci_assign_irq() that relies on
the struct pci_host_bridge.map_irq pointer to map IRQ for a given device.
The benefits this brings are twofold:
- the IRQ for a device is assigned once at probe time
- the IRQ assignment works also for hotplugged devices
With all DT based PCI host bridges converted to the struct
pci_host_bridge.{map/swizzle}_irq hooks mechanism the DT IRQ allocation in
ARM64 pcibios_alloc_irq() is now redundant and can be removed.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Legacy PCI host controllers (i.e. host controllers that set up the PCI bus
through the ARM pci_common_init() API) currently rely on
pci_fixup_irqs() to assign legacy PCI IRQs to devices. This is not ideal
in that pci_fixup_irqs() assigns IRQs for all PCI devices present in a given
system, some of which may well be enabled by the time pci_fixup_irqs() is
called (i.e. a system with multiple host controllers). With the introduction
of struct pci_host_bridge.(*map_irq) pointer it is possible to assign IRQs
for all devices originating from a PCI host bridge at probe time; this is
implemented through pci_assign_irq() that relies on the struct
pci_host_bridge.map_irq pointer to map IRQ for a given device.
The benefits this brings are twofold:
- the IRQ for a device is assigned once at probe time
- the IRQ assignment works also for hotplugged devices
Remove pci_fixup_irqs() call from bios32 code and rely on pci_assign_irq()
to carry out the IRQ mapping at device probe time.
The map_irq() and swizzle_irq() struct pci_host_bridge callbacks are set up
in the struct pci_host_bridge created in the bios32 pcibios_init_hw()
function and mach-* code paths (for PCI mach implementations that require a
specific struct hw_pci.(*scan) function callback).
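As a rough illustration of what a swizzle/map callback pair does at probe time, here is a minimal sketch. It assumes the conventional PCI-to-PCI bridge INTx rotation; the function names, the list-of-slots representation and the map_irq signature are hypothetical and are not the kernel's API.

```python
def swizzle_interrupt_pin(slot: int, pin: int) -> int:
    """Rotate an INTx pin (1=INTA .. 4=INTD) by the device's slot number."""
    if not 1 <= pin <= 4:
        raise ValueError("pin must be 1 (INTA) through 4 (INTD)")
    return ((pin - 1 + slot) % 4) + 1


def assign_irq(slots_up_to_root, pin, map_irq):
    """Resolve the IRQ for one device.

    slots_up_to_root lists slot numbers starting at the device itself and
    ending with the slot of the bridge (or device) sitting on the root bus.
    The pin is swizzled once per bus crossed, then the host bridge's
    map_irq callback (hypothetical signature: slot, pin -> irq) is used.
    """
    *below_root, root_slot = slots_up_to_root
    for slot in below_root:
        pin = swizzle_interrupt_pin(slot, pin)
    return map_irq(root_slot, pin)
```

For example, a device in slot 2 behind a bridge in slot 5 that raises INTA reaches the host bridge as INTC, and only then is mapped to an IRQ number.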
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
[bhelgaas: folded in fixes from Lorenzo:
http://lkml.kernel.org/r/20170701140629.GC8977@red-moon]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Andrew Lunn <andrew@lunn.ch>
When a process runs out of stack the parisc kernel wrongly faults with SIGBUS
instead of the expected SIGSEGV signal.
This example shows how the kernel faults:
do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000]
trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000
The vma->vm_end value is the first address which does not belong to the vma, so
adjust the check to include vma->vm_end in the range for which to send the
SIGSEGV signal.
This patch unbreaks building the debian libsigsegv package.
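The off-by-one can be reproduced with the addresses from the example above. This is a simplified sketch: it only models the range check, ignores the permission checks a real fault handler performs, and the function and parameter names are hypothetical.

```python
SIGSEGV, SIGBUS = "SIGSEGV", "SIGBUS"


def fault_signal(address, vm_start, vm_end, end_is_exclusive=True):
    """Pick the signal for a faulting address relative to a vma.

    A vma covers the half-open range [vm_start, vm_end), so vm_end itself
    is already outside. With end_is_exclusive=False the check treats
    vm_end as still inside, which models the behaviour before the fix.
    """
    if end_is_exclusive:
        outside = address < vm_start or address >= vm_end
    else:
        outside = address < vm_start or address > vm_end
    return SIGSEGV if outside else SIGBUS
```

With address=0xfaac2000, vm_start=0xfa2c2000 and vm_end=0xfaac2000, the inclusive check yields SIGBUS while the exclusive check yields SIGSEGV.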
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
Architectures with a compat syscall table must put compat_sys_keyctl()
in it, not sys_keyctl(). The parisc architecture was not doing this;
fix it.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Pull MIPS fixes from Ralf Baechle:
"Here's a final round of fixes for 4.12:
- Fix misordered instructions in assembly code making kernel startup
via UHB unreliable.
- Fix special case of MADDF and MSUBF emulation.
- Fix alignment issue in address calculation in pm-cps on 64 bit.
- Fix IRQ tracing & lockdep when rescheduling
- Systems with MAARs require post-DMA cache flushes.
The reordering fix and the MADDF/MSUBF fix have sat in linux-next for
a number of days. The others haven't propagated from my pull tree to
linux-next yet but all have survived manual testing and Imagination's
automated test system and there are no pending bug reports"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Avoid accidental raw backtrace
MIPS: Perform post-DMA cache flushes on systems with MAARs
MIPS: Fix IRQ tracing & lockdep when rescheduling
MIPS: pm-cps: Drop manual cache-line alignment of ready_count
MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately
MIPS: head: Reorder instructions missing a delay slot
Pull ARM fix from Russell King:
"One final fix for 4.12 - Doug found a boot failure case triggered by
requesting a non-even MB vmalloc size"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8685/1: ensure memblock-limit is pmd-aligned
On POWER9 SMT8 the 24x7 API returns two result elements for physical core
and virtual CPU events and we need to add their counts to get the final
result.
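In other words, the value reported for such an event is the sum over its result elements. A minimal sketch, with a hypothetical helper name and made-up counts:

```python
def combined_count(element_counts):
    """Return the event's final value: the sum of all result element counts."""
    return sum(element_counts)
```

An event that comes back with two elements counting 120 and 35 would be reported as 155; an event with a single element is unchanged.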
Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 introduces a new version of the hypervisor API to access the 24x7
perf counters. The new version changed some of the structures used for
requests and results.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There's an H24x7_DATA_BUFFER_SIZE constant, so use it in init_24x7_request.
There's also an HV_PERF_DOMAIN_MAX constant, so use it in
h_24x7_event_init. This makes the comment above the check redundant,
so remove it.
In add_event_to_24x7_request, a statement is terminated with a comma
instead of a semicolon. Fix it.
In hv-24x7.h, improve comments in struct hv_24x7_result.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The H_GET_24X7_CATALOG_PAGE hcall can return a signed error code, so fix
this in the code.
The H_GET_24X7_DATA hcall can return a signed error code, so fix this in
the code. Also, don't truncate it to 32 bit to use as return value for
make_24x7_request. In case of error h_24x7_event_commit_txn passes that
return value to generic code, so it should be a proper errno. The other
caller of make_24x7_request is single_24x7_request, whose callers don't
actually care which error code is returned so they are not affected by this
change.
Finally, h_24x7_get_value doesn't use the error code from
single_24x7_request, so there's no need to store it.
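Why the signedness matters can be sketched as follows. The helpers are hypothetical, and H_PARAMETER = -4 is the value as I recall it from the powerpc hcall definitions, so treat it as an assumption.

```python
H_PARAMETER = -4  # assumed value of one negative hcall status code

U64_MASK = (1 << 64) - 1


def as_unsigned64(value: int) -> int:
    """View a status code the way an unsigned 64-bit variable would hold it."""
    return value & U64_MASK


def as_signed64(raw: int) -> int:
    """Reinterpret a raw 64-bit value as a two's complement signed number."""
    return raw - (1 << 64) if raw >= (1 << 63) else raw
```

Held in an unsigned variable, the code becomes 0xFFFFFFFFFFFFFFFC and a "less than zero" test can never fire; reading it back as signed recovers -4.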
Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
make_24x7_request already calls log_24x7_hcall if it fails, so callers
don't have to do it again.
In fact, since the latter is now only called from the former, there's no
need for a separate log_24x7_hcall anymore so remove it.
Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
hv-24x7.h has a comment mentioning that result_buffer->results can't be
indexed as a normal array because it may contain results of variable sizes,
so fix the loop in h_24x7_event_commit_txn to take the variation into
account when iterating through results.
Another problem in that loop is that it sets h24x7hw->events[i] to NULL.
This assumes that only the i'th result maps to the i'th request, but that
is not guaranteed to be true. We need to leave the event in the array so
that we don't dereference a NULL pointer in case more than one result maps
to one request.
We still assume that each result has only one result element, so warn if
that assumption is violated.
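The iteration problem can be sketched like this. The 8-byte header layout below (result index, completion flag, big-endian element count and element size, two reserved bytes) is an assumption made for illustration, not a statement of the real hv_24x7_result layout, and walk_results is a hypothetical name.

```python
import struct

# Assumed header: u8 result_ix, u8 results_complete,
# be16 num_elements_returned, be16 result_element_data_size, 2 pad bytes.
RESULT_HEADER = struct.Struct(">BBHH2x")


def walk_results(buf: bytes, num_results: int):
    """Yield (result_ix, num_elements, element_bytes) for each result.

    Each result occupies header + num_elements * element_size bytes, so
    the offset of the next result depends on the one before it and the
    buffer cannot be indexed like an array of fixed-size entries.
    """
    offset = 0
    for _ in range(num_results):
        result_ix, _complete, num_elements, element_size = (
            RESULT_HEADER.unpack_from(buf, offset)
        )
        data_start = offset + RESULT_HEADER.size
        data_end = data_start + num_elements * element_size
        yield result_ix, num_elements, buf[data_start:data_end]
        offset = data_end
```

A caller that still assumes one element per result can check num_elements on each yielded tuple and warn when it is not 1.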
Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
request_buffer can hold 254 requests, so if it already has that number of
entries we can't add a new one.
Also, define a constant to show where the number comes from.
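The bound check amounts to the following sketch. The constant name and the list-based buffer are hypothetical; only the capacity of 254 comes from the text above.

```python
MAX_NUM_REQUESTS = 254  # capacity stated in the commit message


def add_request(requests: list, request) -> bool:
    """Append a request unless the buffer already holds the maximum.

    The comparison must be >= rather than >: with 254 entries present
    the buffer is already full and a 255th would overflow it.
    """
    if len(requests) >= MAX_NUM_REQUESTS:
        return False
    requests.append(request)
    return True
```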
Fixes: e3ee15dc5d ("powerpc/perf/hv-24x7: Define add_event_to_24x7_request()")
Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
H_GET_24X7_CATALOG_PAGE needs to be passed the version number obtained from
the first catalog page obtained previously. This is a 64 bit number, but
create_events_from_catalog truncates it to 32-bit.
This worked on POWER8, but POWER9 actually uses the upper bits so the call
fails with H_P3 because the hypervisor doesn't recognize the version.
This patch also adds the hcall return code to the error message, which is
helpful when debugging the problem.
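The effect of the truncation can be sketched as below. The version values in the usage note are invented for illustration; the real catalog version numbers are not given here.

```python
U32_MASK = 0xFFFFFFFF


def truncate_to_u32(value: int) -> int:
    """Keep only the low 32 bits, as storing into a 32-bit variable would."""
    return value & U32_MASK


def survives_truncation(version: int) -> bool:
    """True if the 64-bit version is unchanged after a 32-bit round trip."""
    return truncate_to_u32(version) == version
```

A version such as 0x1 survives, which matches the code working when the upper bits are unused; a version such as 0x0000000100000002 is silently turned into 0x2, which the hypervisor would not recognize.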
Fixes: 5c5cd7b502 ("powerpc/perf/hv-24x7: parse catalog and populate sysfs with events")
Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Flip the switch. Running around and screaming "IT'S ALIVE" is optional,
but recommended.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Adds support for removing bolted (i.e. kernel linear mapping) mappings on
powernv. This is needed to support memory hot unplug operations which
are required for the teardown of DAX/PMEM devices.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S. This
is used to differentiate device backed memory from transparent huge
pages since they are handled in more or less the same manner by the core
mm code.
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Adds support to powerpc for the altmap feature of ZONE_DEVICE memory. An
altmap is a driver provided region that is used to provide the backing
storage for the struct pages of ZONE_DEVICE memory. In situations where
a large amount of ZONE_DEVICE memory is being added to the system the
altmap reduces pressure on main system memory by allowing the mm/
metadata to be stored on the device itself rather than in main memory.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Removes an indentation level and shuffles some code around to make the
following patch cleaner. No functional changes.
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently ZONE_DEVICE depends on X86_64 and this will get unwieldy as
new architectures (and platforms) get ZONE_DEVICE support. Move to an
arch selected Kconfig option to save us the trouble.
Cc: linux-mm@kvack.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Export it so it can be referenced inside a module.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use the different spin loop primitives in some simple powerpc
spin loops, including those which will spin as a common case.
This will help to test the spin loop primitives before more
conversions are done.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add some includes of <linux/processor.h>]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This reverts commits 2c0cba482e ("arm: sun8i: sunxi-h3-h5: Add dt node
for the syscon control module") to 2428fd0fe5 ("arm64: defconfig: Enable
dwmac-sun8i driver on defconfig") and 3432a86e64 ("arm: sun8i:
orangepipc: use internal phy-mode") to 5a79b4f2a5 ("arm: sun8i:
orangepi-2: use internal phy-mode") that should be merged
through the arm-soc tree, and end up in merge conflicts and build failures.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull x86 fixes from Thomas Gleixner:
"Fixlets for x86:
- Prevent kexec crash when KASLR is enabled, which was caused by an
address calculation bug
- Restore the freeing of PUDs on memory hot remove
- Correct a negated pointer check in the intel uncore performance
monitoring driver
- Plug a memory leak in an error exit path in the RDT code"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel_rdt: Fix memory leak on mount failure
x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug
x86/boot/KASLR: Add checking for the offset of kernel virtual address randomization
perf/x86/intel/uncore: Fix wrong box pointer check
x86/mm/hotplug: Fix BUG_ON() after hot-remove by not freeing PUD
At present, interrupts are hard-disabled fairly late in the guest
entry path, in the assembly code. Since we check for pending signals
for the vCPU(s) task(s) earlier in the guest entry path, it is
possible for a signal to be delivered before we enter the guest but
not be noticed until after we exit the guest for some other reason.
Similarly, it is possible for the scheduler to request a reschedule
while we are in the guest entry path, and we won't notice until after
we have run the guest, potentially for a whole timeslice.
Furthermore, with a radix guest on POWER9, we can take the interrupt
with the MMU on. In this case we end up leaving interrupts
hard-disabled after the guest exit, and they are likely to stay
hard-disabled until we exit to userspace or context-switch to
another process. This was masking the fact that we were also not
setting the RI (recoverable interrupt) bit in the MSR, meaning
that if we had taken an interrupt, it would have crashed the host
kernel with an unrecoverable interrupt message.
To close these races, we need to check for signals and reschedule
requests after hard-disabling interrupts, and then keep interrupts
hard-disabled until we enter the guest. If there is a signal or a
reschedule request from another CPU, it will send an IPI, which will
cause a guest exit.
This puts the interrupt disabling before we call kvmppc_start_thread()
for all the secondary threads of this core that are going to run vCPUs.
The reason for that is that once we have started the secondary threads
there is no easy way to back out without going through at least part
of the guest entry path. However, kvmppc_start_thread() includes some
code for radix guests which needs to call smp_call_function(), which
must be called with interrupts enabled. To solve this problem, this
patch moves that code into a separate function that is called earlier.
When the guest exit is caused by an external interrupt, a hypervisor
doorbell or a hypervisor maintenance interrupt, we now handle these
using the replay facility. __kvmppc_vcore_entry() now returns the
trap number that caused the exit on this thread, and instead of the
assembly code jumping to the handler entry, we return to C code with
interrupts still hard-disabled and set the irq_happened flag in the
PACA, so that when we do local_irq_enable() the appropriate handler
gets called.
With all this, we now have the interrupt soft-enable flag clear while
we are in the guest. This is useful because code in the real-mode
hypercall handlers that checks whether interrupts are enabled will
now see that they are disabled, which is correct, since interrupts
are hard-disabled in the real-mode code.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Since commit b009031f74 ("KVM: PPC: Book3S HV: Take out virtual
core piggybacking code", 2016-09-15), we only have at most one
vcore per subcore. Previously, the fact that there might be more
than one vcore per subcore meant that we had the notion of a
"master vcore", which was the vcore that controlled thread 0 of
the subcore. We also needed a list per subcore in the core_info
struct to record which vcores belonged to each subcore. Now that
there can only be one vcore in the subcore, we can replace the
list with a simple pointer and get rid of the notion of the
master vcore (and in fact treat every vcore as a master vcore).
We can also get rid of the subcore_vm[] field in the core_info
struct since it is never read.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Intel PT:
- Support "ptwrite" instruction, a way to stuff 32 or 64 bit values into
the Intel PT trace (Adrian Hunter)
- Support power events in Intel PT to report changes to C-state (Adrian
Hunter)
- Synthesize Intel PT events as PERF_RECORD_SAMPLE records with a
perf_event_attr.type (PERF_TYPE_SYNTH) just after the range used by the
kernel, i.e. right after what is allocated for PMUs, at INT_MAX + 1U,
attr.config will have the identification for the synthesized event and
the PERF_SAMPLE_RAW payload will have its fields (Adrian Hunter)
Infrastructure:
- Remove warning() and error(), using instead pr_warning() and
pr_error(), consolidating error reporting (Arnaldo Carvalho de Melo)
- Add platform dependency to 'perf test 15' (Thomas Richter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo-4.13-20170630' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
Intel PT enhancements:
- Support "ptwrite" instruction, a way to stuff 32 or 64 bit values into
the Intel PT trace (Adrian Hunter)
- Support power events in Intel PT to report changes to C-state (Adrian
Hunter)
- Synthesize Intel PT events as PERF_RECORD_SAMPLE records with a
perf_event_attr.type (PERF_TYPE_SYNTH) just after the range used by the
kernel, i.e. right after what is allocated for PMUs, at INT_MAX + 1U,
attr.config will have the identification for the synthesized event and
the PERF_SAMPLE_RAW payload will have its fields (Adrian Hunter)
Infrastructure changes:
- Remove warning() and error(), using instead pr_warning() and
pr_error(), consolidating error reporting (Arnaldo Carvalho de Melo)
- Add platform dependency to 'perf test 15' (Thomas Richter)
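The placement of the synthesized event type can be checked with a little arithmetic. This sketch assumes a 32-bit int and a 32-bit unsigned type field; is_synthesized is a hypothetical helper.

```python
INT_MAX = 2**31 - 1          # largest value of a 32-bit signed int
PERF_TYPE_SYNTH = INT_MAX + 1  # first value past that range


def is_synthesized(attr_type: int) -> bool:
    """True if the attr type is the one reserved for synthesized events."""
    return attr_type == PERF_TYPE_SYNTH
```

The resulting value is 0x80000000, which still fits in an unsigned 32-bit field but can never collide with a type number that was handed out as a non-negative int.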
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With the new task struct randomization, we can run into a build
failure for certain random seeds, which will place fields beyond
the allowed immediate size in the assembly:
arch/arm/kernel/entry-armv.S: Assembler messages:
arch/arm/kernel/entry-armv.S:803: Error: bad immediate value for offset (4096)
Only two constants in asm-offsets.h are affected, and I'm changing
both of them here to work correctly in all configurations.
One more macro has the problem, but is currently unused, so this
removes it instead of adding complexity.
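The assembler error follows from the 12-bit immediate offset of the ARM word load/store encoding, which reaches at most 4095 bytes from the base register. The sketch below shows the limit and one plausible way to cope with a larger offset by splitting it; the helper names are hypothetical and the split is an illustration, not necessarily what this patch does.

```python
IMM12_MASK = 0xFFF  # 12-bit immediate offset field


def fits_ldr_immediate(offset: int) -> bool:
    """True if the offset can be encoded directly in the load/store."""
    return -IMM12_MASK <= offset <= IMM12_MASK


def split_offset(offset: int):
    """Split a non-negative offset into (base_adjust, immediate).

    base_adjust would be added to the base register first; the remaining
    immediate then always fits the 12-bit field.
    """
    return offset & ~IMM12_MASK, offset & IMM12_MASK
```

An offset of 4096, as in the error message above, does not fit directly but splits into a base adjustment of 4096 and an immediate of 0.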
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[kees: Adjust commit log slightly]
Signed-off-by: Kees Cook <keescook@chromium.org>
Two fixes for code we merged this cycle:
- cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
- Avoid miscompilation w/GCC 4.6.3 on 32-bit - don't inline copy_to/from_user()
Thanks to:
Al Viro, Larry Finger, Christophe Lombard.
Merge tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Hopefully the last two powerpc fixes for 4.12.
The CXL one is larger than I'd usually send at rc7, but it fixes new
code this cycle, so better to have it working for the release. It was
actually sent a few weeks back but got blocked in testing behind
another fix that was causing issues.
We are still tracking one crash in v4.12-rc7, but only one person has
reproduced it and the commit identified by bisect doesn't touch any of
the relevant code, so I think it's 50/50 whether that commit is
actually the problem or it's some code layout / toolchain issue.
Two fixes for code we merged this cycle:
- cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
- Avoid miscompilation w/GCC 4.6.3 on 32-bit - don't inline
copy_to/from_user()
Thanks to Al Viro, Larry Finger, Christophe Lombard"
* tag 'powerpc-4.12-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user()
cxl: Fixes for Coherent Accelerator Interface Architecture 2.0
Two fixes:
* A fix for AMD IOMMU interrupt remapping code when
IRQs are forwarded directly to KVM guests
* Fixed check in the recently merged code to allow
tboot with Intel VT-d disabled
Merge tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Two fixes:
- A fix for AMD IOMMU interrupt remapping code when IRQs are
forwarded directly to KVM guests
- Fixed check in the recently merged code to allow tboot with
Intel VT-d disabled"
* tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix interrupt remapping when disable guest_mode
iommu/vt-d: Correctly disable Intel IOMMU force on
DMA operations for the NOMMU case have just been factored out into a
separate compilation unit, so don't keep dead code.
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We now have a dedicated non-cacheable region for consistent DMA
operations. However, that region can still be marked as bufferable by
the MPU, so it is safer to have barriers by default. M-class machines
that didn't need them until now likely won't need them in the future
either, so we offer this as an option.
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
R- and M-class CPUs can have memory covered by an MPU, which in turn might
configure RAM as Normal, i.e. bufferable and cacheable. This breaks
dma_alloc_coherent() and friends, since data can now get stuck in caches
or be buffered.
This patch factors out DMA support for NOMMU configuration into
separate entity which provides dedicated dma_ops. We have to handle
there several cases:
- configurations with MMU/MPU setup
- configurations without MMU/MPU setup
- special case for M-class, since caches and MPU there are optional
In general we rely on default DMA area for coherent allocations or/and
per-device memory reserves suitable for coherent DMA, so if such
regions are set coherent allocations go from there.
In case the MMU/MPU was not set up we fall back to the normal page allocator
for DMA memory allocation.
In case we run M-class cpus, for configuration without cache support
(like Cortex-M3/M4) dma operations are forced to be coherent and wired
with dma-noop (such decision is made based on cacheid global
variable); however, if caches are detected there and no DMA coherent
region is given (either default or per-device), DMA is disallowed even
if the MPU is not set up, because M-class implements a system memory map
which defines part of the address space as Normal memory.
Reported-by: Alexandre Torgue <alexandre.torgue@st.com>
Reported-by: Andras Szemzo <sza@esh.hu>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
[hch: removed the dma_supported() implementation that isn't required anymore]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Currently, the internals of dma_common_mmap() are compiled out if the build is
done for either NOMMU or a target which explicitly says it does not
have/want coherent DMA mmap. It turned out that dma_common_mmap() can
be handy in a NOMMU setup (at least for ARM).
This patch converts existing NOMMU targets to use ARCH_NO_COHERENT_DMA_MMAP,
thus when CONFIG_MMU is gone from dma_common_mmap() their behaviour stays
unchanged.
ARM is not converted to ARCH_NO_COHERENT_DMA_MMAP because it 1)
already has an mmap callback which can handle (to some extent) NOMMU and 2)
already defines a dummy pgprot_noncached() for NOMMU builds.
c6x and frv stay untouched since they already have ARCH_NO_COHERENT_DMA_MMAP.
Cc: Steven Miao <realmz6@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
On an AMD Carrizo laptop, when EHCI runtime PM is enabled, EHCI ports do
not assert PME# for device plug/unplug events while in D3.
As Alan Stern points out [1], the PME signal is not enabled when the
controller is in D3, therefore it is not woken up when new devices get
plugged in.
Testing shows PME signal works when the EHCI power state is D2.
Clear the PCI_PM_CAP_PME_D3 and PCI_PM_CAP_PME_D3cold bits in
dev->pme_support to indicate the device will not assert PME# from those
states.
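The bit manipulation can be sketched as follows. The numeric values are the PME_Support bit positions as I recall them from the PCI power management capability layout, and the sketch assumes pme_support holds that field shifted down to bit 0; treat both as assumptions. clear_d3_pme is a hypothetical helper.

```python
# Assumed values (PME_Support bits within the PM capabilities register).
PCI_PM_CAP_PME_SHIFT = 11
PCI_PM_CAP_PME_D0 = 0x0800
PCI_PM_CAP_PME_D1 = 0x1000
PCI_PM_CAP_PME_D2 = 0x2000
PCI_PM_CAP_PME_D3 = 0x4000
PCI_PM_CAP_PME_D3cold = 0x8000


def clear_d3_pme(pme_support: int) -> int:
    """Drop the D3hot and D3cold bits from a shifted-down PME mask."""
    mask = (PCI_PM_CAP_PME_D3 | PCI_PM_CAP_PME_D3cold) >> PCI_PM_CAP_PME_SHIFT
    return pme_support & ~mask
```

Starting from a device that advertises PME# from every state (0x1f), the result is 0x07: D0, D1 and D2 remain, so D2 stays usable as a wakeup-capable state.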
[1] http://lkml.kernel.org/r/Pine.LNX.4.44L0.1706121010010.2092-100000@iolanthe.rowland.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196091
Link: https://support.amd.com/TechDocs/46837.pdf (Section 23)
Link: https://support.amd.com/TechDocs/42413.pdf (Appendix A2)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
[bhelgaas: changelog, add parens in quirk]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
struct jit_ctx::image is used to store a pointer to the jitted
instructions, which are always little-endian. These instructions
are thus correctly converted from native order to little-endian
before being stored but the pointer 'image' is declared as for
native order values.
Fix this by declaring the field as __le32* instead of u32*.
Same for the pointer used in jit_fill_hole() to initialize
the image.
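What "always little-endian" means for the stored image can be sketched as below. The emit helper is hypothetical, and the AArch64 NOP encoding is used only as an example instruction word.

```python
import struct

AARCH64_NOP = 0xD503201F  # example instruction word


def emit(image: bytearray, index: int, insn: int) -> None:
    """Store a 32-bit instruction at slot index in little-endian order.

    The "<I" format forces little-endian layout regardless of the byte
    order of the machine running this code, which is the property the
    __le32 annotation documents for the image buffer.
    """
    struct.pack_into("<I", image, index * 4, insn)
```

After emit(image, 1, AARCH64_NOP) the four bytes at offset 4 are 1f 20 03 d5 on any host, and reading them back as little-endian recovers the original word.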
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work with const
attribute_group. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The macro insn_fetch marks the 'type' argument as having a specified
alignment. Type attributes can only be applied to structs, unions, or
enums, but insn_fetch is only ever invoked with integral types, so Clang
produces 19 -Wignored-attributes warnings for this source file.
Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- vcpu request overhaul
- allow timer and PMU to have their interrupt number
selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisoning
- the usual crop of fixes and cleanups
Merge tag 'kvmarm-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM updates for 4.13
- vcpu request overhaul
- allow timer and PMU to have their interrupt number
selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisoning
- the usual crop of fixes and cleanups
Conflicts:
arch/s390/include/asm/kvm_host.h
In preparation for an objtool rewrite which will have broader checks,
whitelist functions and files which cause problems because they do
unusual things with the stack.
These whitelists serve as a TODO list for which functions and files
don't yet have undwarf unwinder coverage. Eventually most of the
whitelists can be removed in favor of manual CFI hint annotations or
objtool improvements.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The comment describes the old explicit IPI-based flush logic, which
is long gone.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/55e44997e56086528140c5180f8337dc53fb7ffc.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It was historically possible to have two concurrent TLB flushes
targeting the same CPU: one initiated locally and one initiated
remotely. This can now cause an OOPS in leave_mm() at
arch/x86/mm/tlb.c:47:
    if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
        BUG();
with this call trace:
flush_tlb_func_local arch/x86/mm/tlb.c:239 [inline]
flush_tlb_mm_range+0x26d/0x370 arch/x86/mm/tlb.c:317
Without reentrancy, this OOPS is impossible: leave_mm() is only
called if we're not in TLBSTATE_OK, but then we're unexpectedly
in TLBSTATE_OK in leave_mm().
This can be caused by flush_tlb_func_remote() happening between
the two checks and calling leave_mm(), resulting in two consecutive
leave_mm() calls on the same CPU with no intervening switch_mm()
calls.
We never saw this OOPS before because the old leave_mm()
implementation didn't put us back in TLBSTATE_OK, so the assertion
didn't fire.
Nadav noticed the reentrancy issue in a different context, but
neither of us realized that it caused a problem yet.
Reported-by: Levin, Alexander (Sasha Levin) <alexander.levin@verizon.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Nadav Amit <nadav.amit@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org
Fixes: 3d28ebceaf ("x86/mm: Rework lazy TLB to track the actual loaded mm")
Link: http://lkml.kernel.org/r/855acf733268d521c9f2e191faee2dcc23a29729.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
According to the Intel datasheet, the REP MOVSB instruction
exposes a pretty heavy setup cost (50 ticks), which hurts
short string copy operations.
This change tries to avoid this cost by calling the explicit
loop available in the unrolled code for strings shorter
than 64 bytes.
The 64-byte cutoff value is arbitrary from the code logic
point of view - it has been selected based on measurements,
as the largest value that still ensures a measurable gain.
Micro benchmarks of the __copy_from_user() function with
lengths in the [0-63] range show this performance gain
(the shorter the string, the larger the gain):
- in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4
- in the [72%-9%] range on Intel Core i7-4810MQ
Other tested CPUs - namely Intel Atom S1260 and AMD Opteron
8216 - show no difference, because they do not expose the
ERMS feature bit.
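The dispatch described above can be sketched in portable C as follows. This is only an illustration under stated assumptions: copy_sketch() and copy_loop() are hypothetical names, memcpy() stands in for the REP MOVSB path, and the real change lives in x86 assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cutoff taken from the changelog; it was chosen by measurement. */
#define SHORT_COPY_CUTOFF 64

/* Plain byte loop: no setup cost, so it suits short lengths. */
static void copy_loop(unsigned char *dst, const unsigned char *src, size_t len)
{
	while (len--)
		*dst++ = *src++;
}

/* Returns 1 if the explicit loop was used, 0 if the bulk path was used. */
static int copy_sketch(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len < SHORT_COPY_CUTOFF) {
		copy_loop(dst, src, len);
		return 1;
	}
	memcpy(dst, src, len);	/* stand-in for the REP MOVSB path */
	return 0;
}
```

A 63-byte copy takes the loop and a 64-byte copy takes the bulk path, matching the [0-63] range that the micro benchmarks cover.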
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com
[ Clarified the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A few minor clean-ups: constify the lbr_desc[] array and make
local function lbr_from_signext_quirk_rd() static to fix a sparse warning:
"symbol 'lbr_from_signext_quirk_rd' was not declared. Should it be static?"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629091406.9870-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
KASLR uses a hack to detect whether we booted via startup_32() or
startup_64(): it checks what is loaded into cr3 and compares it to
_pgtables. _pgtables is the array from which early code allocates
page tables.
KASLR expects cr3 to point to _pgtables if we booted via startup_32(), but
that's not true if we booted with 5-level paging enabled. In this case the
top-level page table is allocated separately and only the first p4d page
table is allocated from the array.
Let's modify the check to cover both 4- and 5-level paging cases.
The patch also renames 'level4p' to 'top_level_pgt' as it can now hold
the page table for the 4th or 5th level, depending on configuration.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170628121730.43079-1-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Kernel text KASLR is separated into physical address and virtual
address randomization. For virtual address randomization, we only
randomize to get an offset between 16M and KERNEL_IMAGE_SIZE.
So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR,
not the original kernel loading address 'output'.
The bug causes a kernel boot failure if the kernel is loaded at a
position other than 16M, the address decided at compile time.
Kexec/kdump is one such practical case.
To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as its initial
value.
Tested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For kernel text KASLR, the virtual address is confined to a 1G area,
[0xffffffff80000000, 0xffffffffc0000000). To implement virtual
address randomization, we only randomize to get an offset
between 16M and 1G, then add this offset to the starting address,
0xffffffff80000000. Here 16M is the offset decided at the linking
stage. So the value of the local variable 'virt_addr', which represents
the offset, plus the kernel output size cannot exceed KERNEL_IMAGE_SIZE.
Add a debug check for the offset. If it is out of bounds, print an error
message and hang.
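The bound stated above can be written as a small predicate. This is a sketch, not the boot code: virt_addr_in_bounds() is a hypothetical name, and the 1G constant is taken from the address range quoted in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: the 1G window described in the changelog. */
#define KERNEL_IMAGE_SIZE_SKETCH 0x40000000ULL

/* True if an image of output_size bytes placed at offset virt_addr still
 * ends inside the window. Written as a subtraction so the sum cannot wrap. */
static bool virt_addr_in_bounds(uint64_t virt_addr, uint64_t output_size)
{
	if (output_size > KERNEL_IMAGE_SIZE_SKETCH)
		return false;
	return virt_addr <= KERNEL_IMAGE_SIZE_SKETCH - output_size;
}
```

With a 32M image, an offset of 0x3e000000 ends exactly at the 1G limit and passes, while 0x3f000000 overruns it and would trigger the error path.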
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1498567146-11990-2-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since commit 81a76d7119 ("MIPS: Avoid using unwind_stack() with
usermode") show_backtrace() invokes the raw backtracer when
cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels
where user and kernel address spaces overlap.
However this is used by show_stack() which creates its own pt_regs on
the stack and leaves cp0_status uninitialised in most of the code paths.
This results in non-deterministic use of the raw backtracer
depending on the previous stack content.
show_stack() deals exclusively with kernel mode stacks anyway, so
explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure
we get a useful backtrace.
Fixes: 81a76d7119 ("MIPS: Avoid using unwind_stack() with usermode")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Patchwork: https://patchwork.linux-mips.org/patch/16656/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Recent CPUs from Imagination Technologies such as the I6400 or P6600 are
able to speculatively fetch data from memory into caches. This means
that if used in a system with non-coherent DMA they require that caches
be invalidated after a device performs DMA, and before the CPU reads the
DMA'd data, in order to ensure that stale values weren't speculatively
prefetched.
Such CPUs also introduced Memory Accessibility Attribute Registers
(MAARs) in order to control the regions in which they are allowed to
speculate. Thus we can use the presence of MAARs as a good indication
that the CPU requires the above cache maintenance. Use the presence of
MAARs to determine the result of cpu_needs_post_dma_flush() in the
default case, in order to handle these recent CPUs correctly.
Note that the return type of cpu_needs_post_dma_flush() is changed to
bool, such that it's clearer what's happening when cpu_has_maar is cast
to bool for the return value. If this patch were backported to a
pre-v4.7 kernel then MIPS_CPU_MAAR was 1ull<<34, so when cast to an int
we would incorrectly return 0. It so happens that MIPS_CPU_MAAR is
currently 1ull<<30, so when truncated to an int gives a non-zero value
anyway, but even so the implicit conversion from long long int to bool
makes it clearer to understand what will happen than the implicit
conversion from long long int to int would. The bool return type also
fits this usage better semantically, so seems like an all-round win.
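The truncation described above can be reproduced in a few lines. The sketch assumes a 32-bit int; needs_flush_int() and needs_flush_bool() are hypothetical names, and the two bit positions are the ones quoted in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAAR_BIT_PRE_4_7 (1ULL << 34)	/* value cited for pre-v4.7 kernels */
#define MAAR_BIT_CURRENT (1ULL << 30)	/* value cited as current */

/* int return type: only the low 32 bits of the masked value survive.
 * Going through uint32_t keeps the truncation well defined in ISO C. */
static int needs_flush_int(unsigned long long options, unsigned long long bit)
{
	return (int)(uint32_t)(options & bit);
}

/* bool return type: any non-zero value converts to true. */
static bool needs_flush_bool(unsigned long long options, unsigned long long bit)
{
	return options & bit;
}
```

With bit 34 set, the int variant yields 0 while the bool variant yields true; with bit 30 both report the feature, which is why the problem stays hidden on current kernels.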
Thanks to Ed for spotting the issue for pre-v4.7 kernels & suggesting
the return type change.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Ed Blake <ed.blake@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16363/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler
from arch/mips/kernel/entry.S we disable interrupts. This is true
regardless of whether we reach work_resched from syscall_exit_work,
resume_userspace or by looping after calling schedule(). Although we
disable interrupts in these paths we don't call trace_hardirqs_off()
before calling into C code which may acquire locks, and we therefore
leave lockdep with an inconsistent view of whether interrupts are
disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are
both enabled.
Without tracing this interrupt state lockdep will print warnings such
as the following once a task returns from a syscall via
syscall_exit_partial with TIF_NEED_RESCHED set:
[ 49.927678] ------------[ cut here ]------------
[ 49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8
[ 49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
[ 49.946355] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00439-gc9fd5d362289-dirty #197
[ 49.963505] Stack : 0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4
[ 49.974431] 0000000000000000 0000000000000000 0000000000000000 000000000000004a
[ 49.985300] ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8
[ 49.996194] 0000000000000001 0000000000000000 0000000000000000 0000000077c8030c
[ 50.007063] 000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88
[ 50.017945] 0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498
[ 50.028827] 0000000000000000 0000000000000001 0000000000000000 0000000000000000
[ 50.039688] 0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc
[ 50.050575] 00000000140084e0 0000000000000000 0000000000000000 0000000000040a00
[ 50.061448] 0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc
[ 50.072327] ...
[ 50.076087] Call Trace:
[ 50.079869] [<ffffffff8010e1b0>] show_stack+0x80/0xa8
[ 50.086577] [<ffffffff805509bc>] dump_stack+0x10c/0x190
[ 50.093498] [<ffffffff8015dde0>] __warn+0xf0/0x108
[ 50.099889] [<ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48
[ 50.107241] [<ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8
[ 50.114961] [<ffffffff801c239c>] lock_is_held_type+0x8c/0xb0
[ 50.122291] [<ffffffff809461b8>] __schedule+0x8c0/0x10f8
[ 50.129221] [<ffffffff80946a60>] schedule+0x30/0x98
[ 50.135659] [<ffffffff80106278>] work_resched+0x8/0x34
[ 50.142397] ---[ end trace 0cb4f6ef5b99fe21 ]---
[ 50.148405] possible reason: unannotated irqs-off.
[ 50.154600] irq event stamp: 400463
[ 50.159566] hardirqs last enabled at (400463): [<ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8
[ 50.171981] hardirqs last disabled at (400462): [<ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0
[ 50.183897] softirqs last enabled at (400450): [<ffffffff8016580c>] __do_softirq+0x4ac/0x6a8
[ 50.195015] softirqs last disabled at (400425): [<ffffffff80165e78>] irq_exit+0x110/0x128
Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off()
when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking
schedule() following the work_resched label because:
1) Interrupts are disabled regardless of the path we take to reach
work_resched() & schedule().
2) Performing the tracing here avoids the need to do it in paths which
disable interrupts but don't call out to C code before hitting a
path which uses the RESTORE_SOME macro that will call
trace_hardirqs_on() or trace_hardirqs_off() as appropriate.
We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling
syscall_trace_leave() for similar reasons, ensuring that lockdep has a
consistent view of state after we re-enable interrupts.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/15385/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We allocate memory for a ready_count variable per-CPU, which is accessed
via a cached non-coherent TLB mapping to perform synchronisation between
threads within the core using LL/SC instructions. In order to ensure
that the variable is contained within its own data cache line we
allocate 2 lines worth of memory & align the resulting pointer to a line
boundary. This is however unnecessary, since kmalloc is guaranteed to
return memory which is at least cache-line aligned (see
ARCH_DMA_MINALIGN). Stop the redundant manual alignment.
Besides cleaning up the code & avoiding needless work, this has the side
effect of avoiding an arithmetic error found by Bryan on 64 bit systems
due to the 32 bit size of the former dlinesz. This led the ready_count
variable to have its upper 32b cleared erroneously for MIPS64 kernels,
causing problems when ready_count was later used on MIPS64 via cpuidle.
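The class of arithmetic error described above can be shown with a mask built from a 32-bit line size. The changelog does not quote the driver's exact expression, so the functions below are hypothetical and the address in the usage note is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Problematic form: ~(dlinesz - 1) is evaluated in 32 bits and then
 * zero-extended, so the mask has its upper 32 bits clear. */
static uint64_t align_mask_narrow(uint64_t addr, uint32_t dlinesz)
{
	return addr & ~(dlinesz - 1);
}

/* Widening before the complement keeps the upper mask bits set. */
static uint64_t align_mask_wide(uint64_t addr, uint32_t dlinesz)
{
	return addr & ~((uint64_t)dlinesz - 1);
}
```

For a 64-byte line and the address 0xa8000000ff160047, the narrow form returns 0x00000000ff160040, losing the upper half, while the wide form returns 0xa8000000ff160040. Dropping the manual alignment altogether, as the patch does, avoids the question.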
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 3179d37ee1 ("MIPS: pm-cps: add PM state entry code for CPS systems")
Reported-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: stable <stable@vger.kernel.org> # v3.16+
Patchwork: https://patchwork.linux-mips.org/patch/15383/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Make thin archives build the default, but keep the config option
to allow exemptions if any breakage can't be quickly solved.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The linker does not like vdso-syms.lds in input archive files.
Make it an extra-y instead.
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: user-mode-linux-devel@lists.sourceforge.net
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The VDSO symbols can't be linked into built-in.o when building with
thin archives, so change this to linking a new object file that is
included into the built-in.o.
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The VDSO symbols can't be linked into built-in.o when building with
thin archives, so change this to linking a new object file that is
included into the built-in.o.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The VDSO symbols can't be linked into built-in.o when building with
thin archives, so change this to linking a new object file that is
included into the built-in.o.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
All the files listed in "extra-y" are generated according to the
dependency. They are still needed in "targets" to include .*.cmd
for incremental building.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Build targets using if_changed(_rule) must depend on FORCE so that
they are evaluated every time.
In order to include .*.cmd files correctly, build targets added to
"targets" must not be prefixed with $(obj)/ because it is done by
scripts/Makefile.lib .
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Otherwise, depending upon link order, the branch relocation
limits could be exceeded.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The pmd containing memblock_limit is cleared by prepare_page_table()
which creates the opportunity for early_alloc() to allocate unmapped
memory if memblock_limit is not pmd aligned causing a boot-time hang.
Commit 965278dcb8 ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
attempted to resolve this problem, but there is a path through the
adjust_lowmem_bounds() routine where if all memory regions start and
end on pmd-aligned addresses the memblock_limit will be set to
arm_lowmem_limit.
Since arm_lowmem_limit can be affected by the vmalloc early parameter,
the value of arm_lowmem_limit may not be pmd-aligned. This commit
corrects this oversight such that memblock_limit is always rounded
down to pmd-alignment.
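Rounding a limit down to a power-of-two boundary is a single mask operation. The sketch below assumes a 2 MiB PMD size purely for illustration; round_down_pow2() is a hypothetical helper, not the kernel's round_down() macro.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration only; the real PMD_SIZE depends on the config. */
#define PMD_SIZE_SKETCH 0x200000ULL

/* Round value down to a multiple of align, which must be a power of two. */
static uint64_t round_down_pow2(uint64_t value, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return value & ~(align - 1);
}
```

A limit of 0x2ff00000 becomes 0x2fe00000, while an already aligned limit such as 0x30000000 is returned unchanged, so applying the rounding unconditionally is harmless.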
Fixes: 965278dcb8 ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
A recent commit moved most of the early boot-up logic from startup_64(),
written in assembly, to __startup_64(), written in C.
Fengguang reported breakage due to the change. It was tracked down to
CONFIG_FUNCTION_TRACER being enabled.
Tracing this function is not possible because it's invoked from the
earliest boot stage before the relocation fixups have been done. It is the
function doing the relocation.
Exclude it from being built with tracer stubs.
Fixes: c88d71508e ("x86/boot/64: Rewrite startup_64() in C")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: lkp@01.org
Link: http://lkml.kernel.org/r/20170627115948.17938-1-kirill.shutemov@linux.intel.com
Do not init a NULL box, as doing so causes a system crash.
The issue appears to be caused by a typo.
This was not noticed because there is no NULL box. Also, most
boxes are enabled by default, so the init code is not critical.
Fixes: fff4b87e59 ("perf/x86/intel/uncore: Make package handling more robust")
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629190926.2456-1-kan.liang@intel.com
Now that compat_vfp_get() uses the regset API to copy the FPSCR
value out to userspace, compat_vfp_set() looks inconsistent. In
particular, compat_vfp_set() will fail if called with kbuf != NULL
&& ubuf == NULL (which is valid usage according to the regset API).
This patch fixes compat_vfp_set() to use user_regset_copyin(),
similarly to compat_vfp_get().
This also squashes a sparse warning triggered by the cast that
drops __user when calling get_user().
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
compat_vfp_set() checks for userspace trying to write an excessive
amount of data to the regset. However this check is conspicuous
for its absence from every other _set() in the arm64 ptrace
implementation. In fact, the core ptrace_regset() already clamps
userspace's iov_len to the regset size before the individual regset
.{get,set}() methods get called.
This patch removes the redundant check.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
If get_user() fails when reading the new FPSCR value from userspace
in compat_vfp_set(), then garbage* will be written to the task's
FPSR and FPCR registers.
This patch prevents this by checking the return from get_user()
first.
[*] Actually, zero, due to the behaviour of get_user() on error, but
that's still not what userspace expects.
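The ordering the fix relies on, checking the read before committing the value, can be sketched in userspace as follows. get_user_sketch() is a hypothetical stand-in that mimics the zero-on-error behaviour mentioned in the footnote; it is not the kernel's get_user().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for get_user(): on a bad pointer it zeroes *dst
 * and returns -EFAULT, otherwise it copies the value and returns 0. */
static int get_user_sketch(uint32_t *dst, const uint32_t *user_src)
{
	if (user_src == NULL) {
		*dst = 0;
		return -EFAULT;
	}
	*dst = *user_src;
	return 0;
}

/* Update the register only once the read is known to have succeeded. */
static int set_fpscr_sketch(uint32_t *fpscr, const uint32_t *user_src)
{
	uint32_t val;
	int ret;

	assert(fpscr != NULL);
	ret = get_user_sketch(&val, user_src);
	if (ret)
		return ret;	/* leave *fpscr untouched on failure */
	*fpscr = val;
	return 0;
}
```

On a failed read the register keeps its previous contents and the error is returned to the caller, instead of the register being overwritten with zero.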
Fixes: 478fcb2cdb ("arm64: Debugging support")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the TSC deadline timer is programmed really close to the deadline or
even in the past, the computation in vmx_set_hv_timer will program the
absolute target tsc value into the vmcs preemption timer field with
delta == 0. This triggers a vmentry followed immediately by a vmx
preemption timer vmexit, and the lapic timer injection is delayed by
the duration of that round trip. The lapic timer emulated by hrtimer,
by contrast, handles this case correctly.
This patch fixes it by firing the lapic timer and injecting a timer interrupt
immediately during the next vmentry if the TSC deadline timer is programmed
really close to the deadline or even in the past. This saves ~300 cycles on
the tsc_deadline_timer test of apic.flat.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the code to cancel the hv timer into the caller, just before
it starts the hrtimer. Check availability of the hv timer in
start_hv_timer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are many cases in which the hv timer must be canceled. Split out
a new function to avoid duplication.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge tag 'actions-arm-soc+sps-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/soc
Pull "Actions Semi ARM SoC for v4.13 #2" from Andreas Färber:
This adds SMP code to bring up the remaining S500 CPU cores
by reusing a helper factored out of the SPS power domains driver.
* tag 'actions-arm-soc+sps-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions:
ARM: owl: smp: Implement SPS power-gating for CPU2 and CPU3
soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating
soc: actions: Add Owl SPS
dt-bindings: power: Add Owl SPS power domains
get_alt_insn() is used to read and create ARM instructions, which
are always stored in memory in little-endian order. These values
are thus correctly converted to/from native order when processed,
but the pointers used to hold the addresses of these instructions
are declared as pointing to native-order values.
Fix this by declaring the pointers as __le32* instead of u32* and
making the few changes this requires, such as removing the unneeded
cast '(u32*)' in front of __ALT_PTR()'s definition.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the flattened device tree format, all integer properties are
in big-endian order.
Here the property "kaslr-seed" is read from the fdt and then
correctly converted to native order (via fdt64_to_cpu()) but the
pointer used for this is not annotated as being for big-endian.
Fix this by declaring the pointer as fdt64_t instead of u64
(fdt64_t being itself typedefed to __be64).
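To make the byte-order point concrete, the sketch below decodes a 64-bit big-endian property value. be64_t and be64_to_cpu_sketch() are hypothetical userspace stand-ins for fdt64_t and fdt64_to_cpu(), and the bytes in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for fdt64_t: eight bytes, most significant first. */
typedef struct { uint8_t b[8]; } be64_t;

/* Assemble the native-order value byte by byte, so the result does not
 * depend on the byte order of the machine running the code. */
static uint64_t be64_to_cpu_sketch(be64_t v)
{
	uint64_t out = 0;

	for (int i = 0; i < 8; i++)
		out = (out << 8) | v.b[i];
	return out;
}
```

The bytes 01 23 45 67 89 ab cd ef decode to 0x0123456789abcdef. Giving the raw value its own type means it cannot be used as a native integer without passing through the conversion.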
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM64 implementation of ip_fast_csum() does most of the work
in 128 or 64 bit and calls csum_fold() to finalize. csum_fold()
itself takes a __wsum argument, to ensure that this value is
always a 32bit native-order value.
Fix this by adding the sadly needed '__force' to cast the native
'sum' to the type '__wsum'.
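For readers unfamiliar with the final step, folding a 32-bit one's-complement sum down to 16 bits looks roughly like this. csum_fold_sketch() is a hypothetical plain-C version using ordinary integer types; it leaves out the __wsum and __sum16 annotations the kernel uses.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement sum into 16 bits and invert it.
 * Two folding steps are needed because the first can produce a carry. */
static uint16_t csum_fold_sketch(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

For example, 0x0001ffff folds to 0x0001 after the carry is added back in, and the inverted result is 0xfffe.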
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
A wide variety of TI platforms support NAND via the
CONFIG_MTD_NAND_OMAP2 driver (and related BCH options), so enable this.
In addition, multi_v7_defconfig supports the dm8168-evm and that
supports root being on a SATA drive, so build the DM816 AHCI driver into
the resulting kernel as well.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mihail Grigorov <michael.grigorov@konsulko.com>
Signed-off-by: Tom Rini <trini@konsulko.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Merge tag 'actions-arm64-soc-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/arm64
Pull "Actions Semi ARM64 SoC for v4.13" from Andreas Färber:
This adds a Kconfig symbol for DTs and drivers being added.
* tag 'actions-arm64-soc-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions:
arm64: Prepare Actions Semi S900
Merge tag 'actions-arm64-dt-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/dt64
Pull "Actions Semi ARM64 based SoC DT for 4.13" from Andreas Färber:
This adds an initial DT for the S900 SoC and a devboard based on it.
* tag 'actions-arm64-dt-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions:
arm64: dts: Add Actions Semi S900 and Bubblegum-96
dt-bindings: Add vendor prefix for uCRobotics
This adds an initial DT for the S500 SoC and a devboard based on it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZTUJnAAoJEPou0S0+fgE/iG8P+wdYY0r5HRpGgRKnMUEPzn2Y
xNGyjF9afyQ8YBv+9qWBqAFct4nxNckEm+0TIVCVQ4Pwuv/i4ovgx5bZ6ud2I9wY
jBIbgonuZx3ycLrUYiKLIvvkqbOYxkaoyU5Wxn/xc3G8VH+iF4NDJ4fok7BWlMk/
hJsuGmq6xC2FTSgQ4aXufLX0fWT3+Lblome9eoKOaN+NMh6iy8yENtPVPpSYIpIL
0Nx22FGZj2htUiabwuBYn2nDAYDW9/IkOPAklbZN0YtZ/jfCti1Ch77APqMuekvD
6pUJAT3/nqpRFPW5DGeuMmln+5yAcM1dar0B53lK+C91pQhk3L/o75YLJprhyxkq
MtbONE2PcPuE31zNU/yvd/xwOeLKPYPTmbV/ceOM2C5qeUghYN/278Io86oCA6Ru
szb3dONzjnFA8Oac2deuUvSOnmWqcH0ST0I/nBYXK9ZozAYyOGlq4lvVf80zayy+
tMpsq3vHLjpguCQQalhTwfix0JfgApgruHoupWgKPZuKFW2Sn/42kUOGOb9djoCx
aw4+OL4S/XWbL39CVkDts00VG0jXihjtNvY1pLVO0PYSatbamzDf+Yg0JKEFZ2PC
yeLkKDc0T9PjWmQBDgAo95e3oP7WgohciC/SEiXxPVAyxNtW4l7A5Lj1iVo0kv7X
E5hkIU6svr10sE/rGtbU
=Z5XX
-----END PGP SIGNATURE-----
Merge tag 'actions-arm-dt-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/dt
Pull "Actions Semi ARM based SoC DT for v4.13" from Andreas Färber:
This adds an initial DT for the S500 SoC and a devboard based on it.
* tag 'actions-arm-dt-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions:
ARM: dts: owl-s500: Add SPS node
ARM: dts: owl-s500: Set CPU enable-method
dt-bindings: arm: cpus: Add S500 enable-method
ARM: dts: Add Actions Semi S500 and LeMaker Guitar
dt-bindings: arm: Document Actions Semi S900
dt-bindings: timer: Document Owl timer
dt-bindings: arm: Document Actions Semi S500
dt-bindings: Add vendor prefix for Actions Semi
This adds a Kconfig symbol and mach-actions with board and SMP code,
plus a MAINTAINERS entry.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZTUHXAAoJEPou0S0+fgE/YUEP+wfgGxJeM4fUk7FCgIEBIaWH
sdIEUX2KokmsZjYOdo7nM5pa6FDQGT+4DSm6B1RoweKzrKWrTX0lqubcuCQpbr68
Zgh641gSS43NsYgWuoIMZL4+b4JknhbFCs1aPjItK2bLz3naLfBgqPvLxcllDFia
//t7qA+yWbVnDfyw/cnctOFnqwayugn9uwfm+GxPwnPzxkjxmXn9RKoNj314xxuf
Np/SnC2Q+q6P9EPrML9wvxwR/h5PQpVwf9jva4QjLZjxq9coMiNF1W1Ye7SCViyw
KD5qSrKg+ST26mR6AXHfTYrGdts1rTM8L9SOoypToZ7UnXu3ki9iHyGDS829DWCV
CtkX+Kgvd2hc7YMyVMiSYANWwq1RIjNrXn2YF+x9bYI+OwlYcUGAI4R0DKjI5FbE
uDLlegprvKZkxYzX7MHpbpaRSEpOAk75Q63hecS9kqtJtOoh5tsI8h4614G58alT
muuGJB9aWJD8Z3tuPJKrmorp2Ffg7lYk2JlWbNnS2wkGWpuSlWs1oc12nif/XsKF
BK6G8Bnt687T+Ddlfz7mtdz9iHCTZ3UASvt0kUdZcKfjq9zyWS9NY1x5NKsSy9lJ
WpTW0N4eV8Iu6pjsdONI5Wuwe7RUfKbZdFI7Tjn3p8d9MLJOF7lhaBSia4VPsxzg
afG9lAE2mGFl8KDsAlXw
=aQiU
-----END PGP SIGNATURE-----
Merge tag 'actions-arm-soc-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/soc
Pull "Actions Semi ARM SoC for v4.13" from Andreas Färber:
This adds a Kconfig symbol and mach-actions with board and SMP code,
plus a MAINTAINERS entry.
* tag 'actions-arm-soc-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions:
MAINTAINERS: Update Actions Semi section with SPS
ARM: owl: Implement CPU enable-method for S500
MAINTAINERS: Add Actions Semi Owl section
ARM: Prepare Actions Semi S500
* Enable RPMSG_QCOM_SMD to get boards working again
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZTV8NAAoJEFKiBbHx2RXVVssP/jtpXxUPpLs3KyH0Kvum/4jC
bG4sv6iW8afXYeHCTq0BDQIZx0HnhIzKbtQ0HNVdmxeCncmqRGFena4q0mDf+vCM
nvi2t5qi5Vvi+GAU4fgSJ4bKftBlEvmCXcmYcdb1famo4HfoKVllFRCIpw7udVM3
ofxe2EbbnEvjCm06L4fR1tnLe5hDgeV0yUhiHnzKM5ACS471dFjrN7N8i5ONjw2r
HA+p3Bcarst5iiClSXXuxFPCg4biCFj0YqtLMxGt9F8vGR4uMxI2Gvo51JFgnNrh
PkmHik//d7UUViTlUyz/qV+T4jwZR++X7JUcRhXEb6PlOnnBTXTyYlFxCWuB1ruq
Ptt40fxjiE9IM2zc/0q5NqACvwO8J+W9HqTP3hsvtHrDeznq7n1vsU1yz9v7eGjH
6cTl9/6hLJATa6e2bMSp+f1VTfnewDx18H31wpAbZwQBFzV62YsyysDMe2hxtEU1
BEdJS+q8riJsBU13pZe7q+GJH1Yhf0eVbs+ktSYXssPxiZXw+P+PRS1YHxcWEQ96
h5R7Myx2Vd3NbSN7G23G8Ylv64mz6OMJQ/kj0RvjykvACPck3C8I+FqGrwOzwFYg
HznJfFueLgliim5ib9giPIkToqOt6szf635JpktLchDGGHvlHeyrQ7HKrgpGNcWu
25BNZiG3S8kJ4rL4avpu
=wPTI
-----END PGP SIGNATURE-----
Merge tag 'qcom-defconfig-for-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into next/defconfig
Pull "Qualcomm ARM Based defconfig Updates for v4.13 - Part 2" from Andy Gross:
* Enable RPMSG_QCOM_SMD to get boards working again
* tag 'qcom-defconfig-for-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux:
ARM: qcom_defconfig: enable RPMSG_QCOM_SMD
- greatly expands DT clock support for meson8b
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZTDWYAAoJEFk3GJrT+8ZlGwsP/0K7Mb81/wkne6Yr7FOQ/PTm
QcPG9QGhj/2QyafiAEu5REeg/oraHxt/HCAC9uz7jM0CfrdYLD0imoMTOgqaIGZc
BtvdJthSRUZPc91ou6vP527MB46ipTtP/69Ed/hmkSALm9QAhs430HsXfmmU7KnS
M0dbQ1KOqQ4Hd/zup3D1LfUlsaOcuiiu73Ys3Xg5zBAq8QWJypLhZL7f9dvVYCQ+
lqEtsjUt/qNvky72oKsSZX6Eb2gHKtC38fOVeYq3Z5xatU2UNWlEBGxmfTyFQP+E
T3wBvCVgSPmqW0wD9yGdfqKtH7ZXJ1OBE/21pxjPLq1Rb45kC26jC9FMDOb0RS3w
jaPzinjCcJ0AR5wUojmy7xbaVBrAESaldtj32Fxz4LIe6qPUB8AAiUsTAzwve/Bk
17RP5Zf2BsBcmfsai+HYc6zGCwqC2pbmotuX3Sfw/CeE3sRhWqtpkixV9xsidM5f
oR7bBsAjAhzRJ+SvMiJ0kPa9iNQdCgE3C2quIns7F35jwuWMAckGe6BeJ6RbX1ZK
Mrzq4r7Q1utYldYfBgus/1Tf/RW+iwqGcyarhq6GAFcjx9siKF4lb5tXt237xwK2
vP+KTRAAqii3F0+sUkzKZV2/RpqRDslmCPuq2mQNRfCtm+Px97vl+saRduz2R2jd
vcKSfM8Vj2voxITL3nNz
=o8Zj
-----END PGP SIGNATURE-----
Merge tag 'amlogic-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into next/dt
Merge "Amlogic 32-bit DT changes for v4.13 (round 2)" from Kevin Hilman:
- greatly expands DT clock support for meson8b
* tag 'amlogic-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: (22 commits)
ARM: dts: meson: use the real ethernet clock on Meson8 and Meson8b
ARM: dts: meson8b: add the SCU device node
ARM: dts: meson: add USB support on Meson8 and Meson8b
ARM: dts: meson: add the hardware random number generator
ARM: dts: meson8: add reserved memory zones
ARM: dts: meson: add the SAR ADC
ARM: dts: meson8: add the pins for the SDIO controller
ARM: dts: meson8: add the PWM_E and PWM_F pins
ARM: dts: meson: use GIC_SPI and IRQ_TYPE_EDGE_RISING macros
ARM: dts: meson: use C preprocessor friendly include syntax
ARM: dts: meson8: fix the IR receiver pins
clk: meson8b: export the ethernet gate clock
clk: meson8b: export the USB clocks
clk: meson8b: export the gate clock for the HW random number generator
clk: meson8b: export the SDIO clock
clk: meson8b: export the SAR ADC clocks
clk: meson-gxbb: un-export the CPU clock
clk: meson-gxbb: expose UART clocks
clk: meson-gxbb: expose SPICC gate
clk: meson-gxbb: expose spdif master clock
...
Here both variables 'cpu_id' and 'entry_point' are read via
read[lq]_relaxed(), from a little-endian annotated pointer
and then used as a native endian value.
This is correct since the read[lq]() family of functions
internally does a little-to-native endian conversion.
But in this case, it is wrong to declare these variables as
little-endian since they are native ones.
Fix this by changing the declaration of these variables
as 'u32' or 'u64' instead of '__le32' / '__le64'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
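The annotation rule above can be pictured with a small userspace sketch. It assumes a hypothetical stand-in, sketch_readl(), for the kernel's readl_relaxed(); the mailbox layout and the helper names are illustrative only and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for readl_relaxed(): loads four bytes
 * stored in little-endian order and returns the value in native byte
 * order, whatever the endianness of the host is. */
static uint32_t sketch_readl(const uint8_t *addr)
{
	assert(addr);
	return (uint32_t)addr[0] |
	       ((uint32_t)addr[1] << 8) |
	       ((uint32_t)addr[2] << 16) |
	       ((uint32_t)addr[3] << 24);
}

/* The conversion already happened inside the accessor, so the result is
 * kept in a plain integer (u32 in the kernel), not in a little-endian
 * annotated type such as __le32. */
static uint32_t sketch_read_cpu_id(const uint8_t *mailbox)
{
	uint32_t cpu_id = sketch_readl(mailbox);

	return cpu_id;
}
```

Only the pointer handed to the accessor describes little-endian storage; everything downstream of the read works on native values.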
Here the entrypoint, declared as a 64 bit integer, is read from
a pointer to 64bit integer but the read is done via readl_relaxed()
which is for 32bit quantities.
All the high bits will thus be lost, which changes the meaning
of the test against zero done later.
Fix this by using readq_relaxed() instead as it should be for
64bit quantities.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
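A minimal sketch of why the narrower read matters, assuming hypothetical helpers that model a 32-bit and a 64-bit read of the same 64-bit entry point by value (they are not the kernel accessors themselves):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit read: only the low 32 bits survive. */
static uint64_t sketch_read32(const uint64_t *entry)
{
	assert(entry);
	return (uint32_t)*entry;
}

/* Hypothetical model of a 64-bit read: the whole value is returned. */
static uint64_t sketch_read64(const uint64_t *entry)
{
	assert(entry);
	return *entry;
}

/* The later "is an entry point set?" test, done on each kind of read. */
static bool sketch_entry_set_32(const uint64_t *entry)
{
	return sketch_read32(entry) != 0;
}

static bool sketch_entry_set_64(const uint64_t *entry)
{
	return sketch_read64(entry) != 0;
}
```

An entry point whose low 32 bits happen to be zero looks unset through the 32-bit read, while the 64-bit read still sees it.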
Here the functions reloc_insn_movw() & reloc_insn_imm() are used
to read, modify and write back ARM instructions, which are always
stored in memory in little-endian order. These values are thus
correctly converted to/from native order but the pointers used to
hold their addresses are declared as for native order values.
Fix this by declaring the pointers as __le32* and removing the
casts that are now unneeded.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
aarch64_insn_write() is used to write an instruction.
As on ARM64 in-memory instructions are always stored
in little-endian order, this function, taking the instruction
opcode in native order, correctly converts it to little-endian
before sending it to a helper function, __aarch64_insn_write(),
which does the actual write.
This is all good, but the variable and argument holding the
converted value are not annotated for a little-endian value
but left for native values.
Fix this by adjusting the prototype of the helper and
directly using the result of cpu_to_le32() without passing
by an intermediate variable (which was not a distinct one
but the same as the one holding the native value).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The function aarch64_insn_read() is used to read an instruction.
On ARM64 instructions are always stored in little-endian order
and thus the function correctly does a little-to-native endian
conversion of the value just read.
However, the variable used to hold the value before the conversion
is not declared for a little-endian value but for a native one.
Fix this by using the correct type for the declaration: __le32.
Note: This only works because the function reading the value,
probe_kernel_read(), takes a void pointer and void pointers
are endian-agnostic. Otherwise probe_kernel_read() would
also need to be properly annotated (or worse, specialized).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Here we're reading thumb or ARM instructions, which are always
stored in memory in little-endian order. These values are thus
correctly converted to native order but the intermediate value
should be annotated as for little-endian values.
Fix this by declaring the intermediate var as __le32 or __le16.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Here we're reading thumb or ARM instructions, which are always
stored in memory in little-endian order. These values are thus
correctly converted to native order but the intermediate value
should be annotated as for little-endian values.
Fix this by declaring the intermediate var as __le32 or __le16.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The only user of thread_saved_pc() in non-arch-specific code was removed
in commit 8243d55977 ("sched/core: Remove pointless printout in
sched_show_task()"). Remove the implementations as well.
Some architectures use thread_saved_pc() in their arch-specific code.
Leave their thread_saved_pc() intact.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Neither soft poweroff (transition to ACPI power state S5) nor
suspend-to-RAM (transition to state S3) works on the Macbook Pro 11,4 and
11,5.
The problem is related to the [mem 0x7fa00000-0x7fbfffff] space. When we
use that space, e.g., by assigning it to the 00:1c.0 Root Port, the ACPI
Power Management 1 Control Register (PM1_CNT) at [io 0x1804] doesn't work
anymore.
Linux does a soft poweroff (transition to S5) by writing to PM1_CNT. The
theory about why this doesn't work is:
- The write to PM1_CNT causes an SMI
- The BIOS SMI handler depends on something in
[mem 0x7fa00000-0x7fbfffff]
- When Linux assigns [mem 0x7fa00000-0x7fbfffff] to the 00:1c.0 Port, it
covers up whatever the SMI handler uses, so the SMI handler no longer
works correctly
Reserve the [mem 0x7fa00000-0x7fbfffff] space so we don't assign it to
anything.
This is voodoo programming, since we don't know what the real conflict is,
but we've failed to find the root cause.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=103211
Tested-by: thejoe@gmail.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Chen Yu <yu.c.chen@intel.com>
- initial machine check forwarding
- migration support for the CMMA page hinting information
- cleanups
- fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJZU5JdAAoJEBF7vIC1phx8gCwP/RTl1DzLsyuSbX/AhneQVb/X
gXRnrtVEMsya4vL5lZxbp8JD5J4nBu8vNlgDmQwXM1KiFVDW5IFyQLUHv5PP899z
357mQC61pbkuDA8BhM71FuQav2V0ZMes+FYsza4Zx+Iev4uQtVfTos/nuMPnRVaD
hSfWKbQ9dH/Yluxn8ClXkUOrLH7luiU7HZoQLTxYPFmyM9BIgSbUH2rSXUbQ/i5I
PLpcky6M52/A/IFeEAt5qASsCwWJhPSLGsLKghDKvHDcBWVSb/M94ypXKInZ0pTf
l97TOwCHVODje0Nn4R7wuoeY1ahOwgfhbI3R8m9Cnck3t7mbWtzYVn3DvSXl/Juk
3dfMkbi/GG9lrHoOwnGVGUsaNw5U11sDZEV+rVDT5847HEnGclNWfIBzr4Lcchdr
7f3qap9AGLWu79e32mOP2yO2zFKXpDdVuFfW/c/ms4wq3v03a6HxcUkIn98m6Q1O
EEKzwknA1tSCdtWKOW9THENmywd1o4pMisC+FHnBxFwllOl5ORpbPegOrPCe7qQW
+MZClAJl0s23NpbEMzwrilHzC1P9RxYTFnhGmVamcAg9PVOcFIOGllum26IXzaFM
SyJ8HxS10SiAIVzv18yw3uxy6BUzzuKulIPu+W7JeOTOAAWiwTNL8wEx1ol93Ioi
531QgI7kPfDnudS14WaM
=L7Ia
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: fixes and features for 4.13
- initial machine check forwarding
- migration support for the CMMA page hinting information
- cleanups
- fixes
The memory operand fetched for INVVPID is 128 bits. Bits 63:16 are
reserved and must be zero. Otherwise, the instruction fails with
VMfail(Invalid operand to INVEPT/INVVPID). If the INVVPID_TYPE is 0
(individual address invalidation), then bits 127:64 must be in
canonical form, or the instruction fails with VMfail(Invalid operand
to INVEPT/INVVPID).
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
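The operand checks described above can be sketched as follows. This is an illustration only, not the KVM code: the struct and function names are hypothetical, and the canonical check assumes 48-bit linear addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the 128-bit INVVPID memory operand. */
struct sketch_invvpid_operand {
	uint64_t vpid; /* bits 63:0, of which bits 63:16 are reserved */
	uint64_t gla;  /* bits 127:64, the linear address */
};

/* Canonical means bits 63:47 are all zero or all one.
 * Assumption: 48-bit linear addresses. */
static bool sketch_is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Returns true if the operand passes the checks; false models
 * VMfail(Invalid operand to INVEPT/INVVPID). */
static bool sketch_invvpid_operand_ok(unsigned int type,
				      const struct sketch_invvpid_operand *op)
{
	assert(op);
	if (op->vpid >> 16)
		return false; /* reserved bits 63:16 must be zero */
	if (type == 0 && !sketch_is_canonical(op->gla))
		return false; /* individual-address type needs a canonical address */
	return true;
}
```

The address is only examined for type 0; the other invalidation types ignore bits 127:64 in this sketch.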
All x86 PCI configuration space accessors have either their own
serialization or can operate completely lockless (ECAM).
Disable the global lock in the generic PCI configuration space accessors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20170316215057.295079391@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
x86 wants to get rid of the global pci_lock protecting the config space
accessors so ECAM mode can operate completely lockless, but the CE4100 PCI
code relies on that to protect the simulation registers.
Restructure the code so it uses the x86 specific pci_config_lock to
serialize the inner workings of the CE4100 PCI magic. That allows removing
the global locking via pci_lock later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20170316215057.126873574@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If the legacy PCI init fails, then there are no PCI config space accessors
available, but the code continues and tries to scan the busses, which fails
due to the lack of config space accessors.
Return right away, if the last init fallback fails.
Switch the few printks to pr_info while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20170316215057.047576516@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
For some historic reason these defines are duplicated and also available in
arch/x86/include/asm/pci_x86.h.
Remove them.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20170316215056.967808646@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The introduction of pci_scan_root_bus_bridge() provides a PCI core API to
scan a PCI root bus backed by an already initialized struct pci_host_bridge
object, which simplifies the bus scan interface and makes the PCI scan root
bus interface easier to generalize as members are added to the struct
pci_host_bridge.
Convert ARM bios32 code to pci_scan_root_bus_bridge() to improve the PCI
root bus scanning interface.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
[bhelgaas: fold in warning fix from Arnd Bergmann <arnd@arndb.de>:
http://lkml.kernel.org/r/20170621215323.3921382-1-arnd@arndb.de]
[bhelgaas: set bridge->ops for mv78xx0]
[bhelgaas: fold in fixes from Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>:
http://lkml.kernel.org/r/20170701135457.GB8977@red-moon]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Andrew Lunn <andrew@lunn.ch>
Many subsystems will not use refcount_t unless there is a way to build the
kernel so that there is no regression in speed compared to atomic_t. This
adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation
which has the validation but is slightly slower. When not enabled,
refcount_t uses the basic unchecked atomic_t routines, which results in
no code changes compared to just using atomic_t directly.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Jann Horn <jannh@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: arozansk@redhat.com
Cc: axboe@kernel.dk
Cc: linux-arch <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
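The trade-off between the two modes can be sketched in userspace with C11 atomics. This is not the kernel implementation: the type and function names are hypothetical, and the checked variant only models two of the validations (no increment from zero, saturation instead of wrap-around).

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_refcount {
	atomic_uint refs;
};

/* Unchecked flavour: a plain atomic increment, which wraps at UINT_MAX. */
static void sketch_inc_unchecked(struct sketch_refcount *r)
{
	atomic_fetch_add(&r->refs, 1u);
}

/* Checked flavour: refuses to resurrect a counter that already dropped
 * to zero and stays pinned at UINT_MAX instead of wrapping. The extra
 * compare-and-exchange loop is where the speed cost comes from. */
static bool sketch_inc_checked(struct sketch_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false; /* object may already be freed */
		if (old == UINT_MAX)
			return true;  /* saturated: leave the counter alone */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

	return true;
}
```

The unchecked variant costs the same as a bare atomic increment, which is the property the commit relies on when the full implementation is not enabled.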
By the time cell_pci_dma_dev_setup calls cell_dma_dev_setup no device can
have the fixed map_ops set yet as it's only set by the set_dma_mask
method. So move the setup for the fixed case to be only called in that
place instead of indirecting through cell_dma_dev_setup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
And instead wire it up as method for all the dma_map_ops instances.
Note that this also means the arch specific check will be fully instead
of partially applied in the AMD iommu driver.
Signed-off-by: Christoph Hellwig <hch@lst.de>
And instead wire it up as method for all the dma_map_ops instances.
Note that the code seems a little fishy for dmabounce and iommu, but
for now I'd like to preserve the existing behavior 1:1.
Signed-off-by: Christoph Hellwig <hch@lst.de>
This implementation is simply bogus - openrisc only has a simple
direct mapped DMA implementation and thus doesn't care about the
address.
Signed-off-by: Christoph Hellwig <hch@lst.de>
This implementation is simply bogus - hexagon only has a simple
direct mapped DMA implementation and thus doesn't care about the
address.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Usually dma_supported decisions are done by the dma_map_ops instance.
Switch sparc to that model by providing a ->dma_supported instance for
sbus that always returns false, and implementations tailored to the sun4u
and sun4v cases for sparc64, and leave it unimplemented for PCI on
sparc32, which means always supported.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
DMA_ERROR_CODE is going to go away, so don't rely on it. Instead
define a ->mapping_error method for all IOMMU based dma operation
instances. The direct ops don't ever return an error and don't
need a ->mapping_error method.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
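A rough sketch of the per-ops error check described above, under stated assumptions: the ops structure, the error value and the 1 MiB window are all hypothetical and only mirror the shape of the idea, not the kernel's struct dma_map_ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_dma_addr_t;

/* Hypothetical private error value and mapping window of the IOMMU ops. */
#define SKETCH_IOMMU_ERROR  (~(sketch_dma_addr_t)0)
#define SKETCH_IOMMU_WINDOW (UINT64_C(1) << 20)

struct sketch_dma_ops {
	sketch_dma_addr_t (*map)(uint64_t phys);
	bool (*mapping_error)(sketch_dma_addr_t addr); /* optional */
};

/* Direct ops never fail, so they provide no mapping_error method. */
static sketch_dma_addr_t sketch_direct_map(uint64_t phys)
{
	return phys;
}

/* IOMMU ops fail for addresses outside their window. */
static sketch_dma_addr_t sketch_iommu_map(uint64_t phys)
{
	return phys < SKETCH_IOMMU_WINDOW ? phys : SKETCH_IOMMU_ERROR;
}

static bool sketch_iommu_mapping_error(sketch_dma_addr_t addr)
{
	return addr == SKETCH_IOMMU_ERROR;
}

static const struct sketch_dma_ops sketch_direct_ops = {
	.map = sketch_direct_map,
};

static const struct sketch_dma_ops sketch_iommu_ops = {
	.map = sketch_iommu_map,
	.mapping_error = sketch_iommu_mapping_error,
};

/* Generic check: ask the ops instance; no method means "never fails". */
static bool sketch_dma_mapping_error(const struct sketch_dma_ops *ops,
				     sketch_dma_addr_t addr)
{
	assert(ops);
	return ops->mapping_error ? ops->mapping_error(addr) : false;
}
```

Because the error value lives with the ops instance, no global constant has to be shared between implementations.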
s390 can also use noop_dma_ops, and while that currently does not return
errors it will do so in the future. Implementing the mapping_error method
is the proper way to have per-ops error conditions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
pnv_wakeup_noloss() expects r12 to contain SRR1 value to determine if the wakeup
reason is an HMI in CHECK_HMI_INTERRUPT.
When we wake up with ESL=0, SRR1 will not contain the wakeup reason, so there is
no point setting r12 to SRR1.
However, we don't set r12 at all so r12 contains garbage (likely a kernel
pointer), and is still used to check HMI assuming that it contained SRR1. This
causes the OPAL msglog to be filled with the following print:
HMI: Received HMI interrupt: HMER = 0x0040000000000000
This patch clears r12 after waking up from stop with ESL=EC=0, so that we don't
accidentally enter the HMI handler in pnv_wakeup_noloss() if the value of
r12[42:45] corresponds to HMI as wakeup reason.
Prior to commit 9d29250136 ("powerpc/64s/idle: Avoid SRR usage in idle
sleep/wake paths") this bug existed, in that we would incorrectly look at SRR1
to check for a HMI when SRR1 didn't contain a wakeup reason. However the SRR1
value would just happen to never have bits 42:45 set.
Fixes: 9d29250136 ("powerpc/64s/idle: Avoid SRR usage in idle sleep/wake paths")
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Change log and comment massaging]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This provides the basic plumbing for handling machine checks when
running guests
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJZU4QPAAoJEBF7vIC1phx8GZsP/2P4nxWXBj0NS/dNq54/u7HU
Va/zHIG7nUX81WZi8OCkPRlvb1RlcgNpIdw3Ar+BueFE6/qwVWBSdstVJCg6JSn4
L8T1srSeV6yQEPq1/I9S8ERYtbC8bOC3dDF6g+KyaKYnICjq5yC01+86MKSVfLTI
vFMPWY/PPCgECtXHjGpWBW6HjofRH3/H+XQbxaoTUyHKwWKdtvWer9K2V7Mc/Cf8
XsyLY2Xq0Y5MBsJs+71Qw8+0R041Et5I3H7Od9lIc3SFYNoenQpk5oTtsujMtDG1
ccMPZKErYI4wHE3Hy1ozK+MdFNbepUk3RBI3oXU25tpFPG3OPuksnOqCVN/iZmm+
le9RuUi9WOOsuygPj2dsnx5v+aheedEcYWqvQ/qrNlP3pXNcpl+8waM6eke8HyCK
1JKcqqGKBNX5wKNE9b5sRTHINWK12EVCQyVrgLlZaXoXLa40NpJPjtV27vr3ttVl
WmGYgwMUTo15Rdr0NSJlXl8iCgIFtWMHvuRhIgp8pBuWWb28zr6aX4w++JPwOOMZ
e4rzn55giCBDnjjDFQK2Knv5XxwnMKafYMxZXfC8gLr5ELjnI6vZDN+1zhT5L2S9
uXd8l6rLN2qik57RzPV6YEDS0iybZnx5HF/ZPrNoFigJpdD7/0jFS5K5N0i+AhV5
UQmGhSGnI7Teguc45mHT
=CTzL
-----END PGP SIGNATURE-----
Merge tag 'nmiforkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into features
Pull kvm patches from Christian Borntraeger:
"s390,kvm: provide plumbing for machine checks when running guests"
This provides the basic plumbing for handling machine checks when
running guests
With the vsie feature enabled, KVM can support nested guests (guest-3).
So inject the machine check into guest-2 if it happens while the nested
guest is running. Guest-2 will then detect that the machine check belongs
to guest-3 and reinject it into guest-3.
The host (guest-1) tries to inject the machine check into the picked
destination vcpu if it's a floating machine check.
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If the exit flag of SIE indicates that a machine check has happened
while the guest was running and needs to be injected, inject it into the
guest accordingly.
But some machine checks, e.g. Channel Report Pending (CRW), refer to
host conditions only (the guest's channel devices are not managed by
the kernel directly) and are therefore not injected into the guest.
External Damage (ED) is also not reinjected into the guest because ETR
conditions are gone in Linux and STP conditions are not enabled in the
guest, and ED contains only these 8 ETR and STP conditions.
In general, instruction-processing damage, system recovery,
storage error, service-processor damage and channel subsystem damage
will be reinjected into the guest, and the rest (system damage,
timing-facility damage, warning, ED and CRW) will be handled on the host.
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
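The split described above can be written down as a small classification table. The enum labels below are hypothetical names for the classes listed in the text; they are not the identifiers used in the s390 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the machine check classes named above. */
enum sketch_mcck {
	SKETCH_MCCK_INSTRUCTION_PROCESSING_DAMAGE,
	SKETCH_MCCK_SYSTEM_RECOVERY,
	SKETCH_MCCK_STORAGE_ERROR,
	SKETCH_MCCK_SERVICE_PROCESSOR_DAMAGE,
	SKETCH_MCCK_CHANNEL_SUBSYSTEM_DAMAGE,
	SKETCH_MCCK_SYSTEM_DAMAGE,
	SKETCH_MCCK_TIMING_FACILITY_DAMAGE,
	SKETCH_MCCK_WARNING,
	SKETCH_MCCK_EXTERNAL_DAMAGE,
	SKETCH_MCCK_CHANNEL_REPORT_PENDING,
	SKETCH_MCCK_COUNT
};

/* Returns true for classes that are reinjected into the guest and
 * false for classes that are handled on the host. */
static bool sketch_reinject_into_guest(enum sketch_mcck kind)
{
	assert(kind < SKETCH_MCCK_COUNT);
	switch (kind) {
	case SKETCH_MCCK_INSTRUCTION_PROCESSING_DAMAGE:
	case SKETCH_MCCK_SYSTEM_RECOVERY:
	case SKETCH_MCCK_STORAGE_ERROR:
	case SKETCH_MCCK_SERVICE_PROCESSOR_DAMAGE:
	case SKETCH_MCCK_CHANNEL_SUBSYSTEM_DAMAGE:
		return true;
	default:
		return false;
	}
}
```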
This provides the basic plumbing for handling machine checks when
running guests
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJZU4QPAAoJEBF7vIC1phx8GZsP/2P4nxWXBj0NS/dNq54/u7HU
Va/zHIG7nUX81WZi8OCkPRlvb1RlcgNpIdw3Ar+BueFE6/qwVWBSdstVJCg6JSn4
L8T1srSeV6yQEPq1/I9S8ERYtbC8bOC3dDF6g+KyaKYnICjq5yC01+86MKSVfLTI
vFMPWY/PPCgECtXHjGpWBW6HjofRH3/H+XQbxaoTUyHKwWKdtvWer9K2V7Mc/Cf8
XsyLY2Xq0Y5MBsJs+71Qw8+0R041Et5I3H7Od9lIc3SFYNoenQpk5oTtsujMtDG1
ccMPZKErYI4wHE3Hy1ozK+MdFNbepUk3RBI3oXU25tpFPG3OPuksnOqCVN/iZmm+
le9RuUi9WOOsuygPj2dsnx5v+aheedEcYWqvQ/qrNlP3pXNcpl+8waM6eke8HyCK
1JKcqqGKBNX5wKNE9b5sRTHINWK12EVCQyVrgLlZaXoXLa40NpJPjtV27vr3ttVl
WmGYgwMUTo15Rdr0NSJlXl8iCgIFtWMHvuRhIgp8pBuWWb28zr6aX4w++JPwOOMZ
e4rzn55giCBDnjjDFQK2Knv5XxwnMKafYMxZXfC8gLr5ELjnI6vZDN+1zhT5L2S9
uXd8l6rLN2qik57RzPV6YEDS0iybZnx5HF/ZPrNoFigJpdD7/0jFS5K5N0i+AhV5
UQmGhSGnI7Teguc45mHT
=CTzL
-----END PGP SIGNATURE-----
Merge tag 'nmiforkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext
s390,kvm: provide plumbing for machine checks when running guests
This provides the basic plumbing for handling machine checks when
running guests
When uid checking is enabled firmware guarantees uniqueness of the uids
and we use them for device enumeration. Tests have shown that uid checking
can be toggled at runtime. This is unfortunate since it can lead to name
clashes.
Recognize these name clashes by allocating bits in zpci_domain even for
firmware provided ids.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
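One way to picture the clash detection: every domain number, whether picked by the kernel or provided by firmware, claims a bit in the same bitmap. The sketch below uses hypothetical names and a 64-entry map; it only illustrates the idea and is not the zpci code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_DOMAINS 64u

struct sketch_domain_map {
	uint64_t bits; /* one bit per domain number in use */
};

/* Claim a specific domain number, e.g. a firmware provided uid.
 * Returns 0 on success and -1 on a clash or an out-of-range id. */
static int sketch_claim_domain(struct sketch_domain_map *map, unsigned int id)
{
	assert(map);
	if (id >= SKETCH_MAX_DOMAINS)
		return -1;
	if (map->bits & (UINT64_C(1) << id))
		return -1; /* already in use: name clash */
	map->bits |= UINT64_C(1) << id;
	return 0;
}

/* Pick the first free domain number when no uid is provided.
 * Returns the number, or -1 if the map is full. */
static int sketch_alloc_domain(struct sketch_domain_map *map)
{
	for (unsigned int id = 0; id < SKETCH_MAX_DOMAINS; id++)
		if (sketch_claim_domain(map, id) == 0)
			return (int)id;
	return -1;
}
```

If uid checking is toggled at runtime, a later firmware uid that collides with an earlier allocation is rejected instead of silently reusing the name.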
Add some debug data to observe the lifetime of the
architecture specific device information.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In contrast to other hotplug events, PEC 0x306 isn't about a single
device but about multiple devices. Also there's no information on what
happened to these devices. We correctly handled hotplug that way but failed
to handle hot-unplug. This patch addresses that and implements
hot-unplug of multiple devices via PEC 0x306.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
PCI hotplug events basically notify about the new state of a
function. Unfortunately some hypervisors implement hotplug
events in a way where it is not clear what the new state of
the function should be.
Use clp_get_state to find the current state of the function
and handle accordingly.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Code handling pci hotplug needs to determine the configuration
state of a pci function. Implement clp_get_state as a wrapper
for list pci functions.
Also change enum zpci_state to match the configuration state
values.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Clean up in zpci_fmb_enable_device when fmb registration fails. Also
don't free the fmb when deregistration fails in zpci_fmb_disable_device
but handle error situations when a function was hot-unplugged.
Also remove the mod_pci helper since it is no longer used.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
DMA tables are freed in zpci_dma_exit_device regardless of the return
code of zpci_unregister_ioat. This could lead to a use after free. On
the other hand during function hot-unplug, zpci_unregister_ioat will
always fail since the function is already gone.
So let zpci_unregister_ioat report success when the function is gone
but don't clean up the dma table when a function could still have it
in access.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
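The resulting rule can be sketched like this. The condition codes, the struct and the function names are hypothetical stand-ins; the real functions talk to the hardware, which the sketch replaces with a parameter.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical outcome of asking the function to drop its DMA tables. */
enum sketch_cc {
	SKETCH_CC_OK,   /* unregistered normally */
	SKETCH_CC_BUSY, /* failed, function may still access the table */
	SKETCH_CC_GONE  /* function no longer exists (hot-unplug) */
};

struct sketch_zdev {
	void *dma_table;
};

/* A function that is gone cannot access anything, so report success. */
static int sketch_unregister_ioat(enum sketch_cc cc)
{
	if (cc == SKETCH_CC_OK || cc == SKETCH_CC_GONE)
		return 0;
	return -EIO;
}

/* Free the table only when unregistering succeeded; otherwise keep it,
 * since the function could still use it. */
static int sketch_dma_exit_device(struct sketch_zdev *zdev, enum sketch_cc cc)
{
	int rc;

	assert(zdev);
	rc = sketch_unregister_ioat(cc);
	if (rc)
		return rc;

	free(zdev->dma_table);
	zdev->dma_table = NULL;
	return 0;
}
```

Keeping the table on failure trades a possible leak for the absence of a use after free.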
When we ask a function to stop creating interrupts this may fail
due to the function being already gone (e.g. after hot-unplug).
Consequently we don't free associated resources like summary bits
and bit vectors used for irq processing. This could lead to
situations where we run out of these resources and fail to set up
new interrupts.
The fix is to just ignore the errors in cases where we can be
sure no new interrupts are generated.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After failures in arch_setup_msi_irqs common code calls
arch_teardown_msi_irqs. Thus, remove cleanup code from
arch_setup_msi_irqs.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add some explanation of how the vmemmap based struct page layout's
physical mapping is allocated and tracked through a linked list. It
also notes a possible race condition.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add some explanation of the layout of the vmemmap virtual address
space and how physical page mapping is only used for valid PFNs
present at any point on the system.
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
nr_cpu_ids can be limited by nr_cpus boot parameter, whereas NR_CPUS is a
compile time constant, which shouldn't be compared against during cpu kick.
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
During secondary start, we do not need to BUG_ON if an invalid CPU number
is passed. We already print an error if secondary cannot be started, so
just return an error instead.
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
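The two secondary-start changes above can be pictured together in a small sketch. The names and the use of -EINVAL are assumptions for illustration; the sketch only shows a bound check against the runtime limit that returns an error instead of crashing.

```c
#include <assert.h>
#include <errno.h>

/* Compile-time upper bound, in the role of NR_CPUS (hypothetical value). */
#define SKETCH_NR_CPUS 2048u

/* Runtime limit, in the role of nr_cpu_ids; a boot parameter such as
 * nr_cpus can lower it below the compile-time constant. */
static unsigned int sketch_nr_cpu_ids = SKETCH_NR_CPUS;

static void sketch_set_nr_cpu_ids(unsigned int n)
{
	assert(n >= 1 && n <= SKETCH_NR_CPUS);
	sketch_nr_cpu_ids = n;
}

/* Validate the CPU number against the runtime limit and report an
 * error to the caller rather than treating it as a fatal bug. */
static int sketch_kick_cpu(unsigned int nr)
{
	if (nr >= sketch_nr_cpu_ids)
		return -EINVAL;
	return 0; /* the real code would start the secondary here */
}
```

With the limit lowered to 4, CPU 100 is rejected even though it is below the compile-time constant, which a comparison against NR_CPUS would have let through.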
The at24 driver allows registering I2C EEPROM chips using different vendors
and devices, but the I2C subsystem does not take the vendor into account
when matching using the I2C table since it only has device entries.
But when matching using an OF table, both the vendor and device have to be
taken into account so the driver defines only a set of compatible strings
using the "atmel" vendor as a generic fallback for compatible I2C devices.
So add this generic fallback to the device node compatible string to make
the device match the driver using the OF device ID table.
Signed-off-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Around 95% of memory is reserved by fadump/capture kernel. All this
memory is freed, one page at a time, on writing '1' to the node
/sys/kernel/fadump_release_mem. On systems with large memory, this
can take a long time to complete, leading to soft lockup warning
messages. To avoid this, add reschedule points at regular intervals.
Also, while memblock_reserve() implicitly takes care of holes in the
given memory range while reserving memory, those holes need to be
taken care of while releasing memory as memory is freed one page at
a time. Add support to skip holes while releasing memory.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
fadump fails to register when there are holes in boot memory area.
Provide a helpful error message to the user in such case.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
To register fadump, the boot memory area (the low memory chunk that a
kernel needs in order to boot successfully with restricted memory) is
assumed to have no holes. But this memory area is currently
not protected from hot-remove operations. So, fadump could fail to
re-register after a memory hot-remove operation, if memory is removed
from boot memory area. To avoid this, ensure that memory from boot
memory area is not hot-removed when fadump is registered.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Reviewed-by: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
fadump sets up crash memory ranges to be used for creating PT_LOAD
program headers in elfcore header. Memory chunk RMA_START through
boot memory area size is added as the first memory range because
firmware, at the time of crash, moves this memory chunk to different
location specified during fadump registration making it necessary to
create a separate program header for it with the correct offset.
This memory chunk is skipped while setting up the remaining memory
ranges. But currently, there is a possibility that some of this memory
may have duplicate entries, for example when it is hot-removed and added
again. Ensure that no two memory ranges represent the same memory.
When 5 lmbs are hot-removed and then hot-plugged before registering
fadump, this is what the program headers in /proc/vmcore exported by
fadump look like
without this change:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
NOTE 0x0000000000010000 0x0000000000000000 0x0000000000000000
0x0000000000001894 0x0000000000001894 0
LOAD 0x0000000000021020 0xc000000000000000 0x0000000000000000
0x0000000040000000 0x0000000040000000 RWE 0
LOAD 0x0000000040031020 0xc000000000000000 0x0000000000000000
0x0000000010000000 0x0000000010000000 RWE 0
LOAD 0x0000000050040000 0xc000000010000000 0x0000000010000000
0x0000000050000000 0x0000000050000000 RWE 0
LOAD 0x00000000a0040000 0xc000000060000000 0x0000000060000000
0x000000019ffe0000 0x000000019ffe0000 RWE 0
and with this change:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
NOTE 0x0000000000010000 0x0000000000000000 0x0000000000000000
0x0000000000001894 0x0000000000001894 0
LOAD 0x0000000000021020 0xc000000000000000 0x0000000000000000
0x0000000040000000 0x0000000040000000 RWE 0
LOAD 0x0000000040030000 0xc000000040000000 0x0000000040000000
0x0000000020000000 0x0000000020000000 RWE 0
LOAD 0x0000000060030000 0xc000000060000000 0x0000000060000000
0x000000019ffe0000 0x000000019ffe0000 RWE 0
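The range setup described above can be modelled roughly as follows. This is an illustrative Python sketch, not the kernel code: the function name, the (start, end) tuple layout and the three input regions are assumptions, chosen so that the result lines up with the "with this change" listing.

```python
def crash_memory_ranges(regions, boot_memory_size, rma_start=0):
    """Return (start, end) ranges with the boot memory area listed once.

    regions: sorted (start, end) tuples, end exclusive (hypothetical input
    standing in for the memory regions known to the kernel).
    """
    boot_end = rma_start + boot_memory_size
    # The boot memory area always comes first, as its own range.
    ranges = [(rma_start, boot_end)]
    for start, end in regions:
        if start < boot_end:
            if end <= boot_end:
                # Entirely inside the boot memory area: already covered.
                continue
            # Straddles the boundary: keep only the part above it.
            start = boot_end
        ranges.append((start, end))
    return ranges


# Assumed region layout after the hot-remove and hot-add of 5 lmbs.
regions = [
    (0x00000000, 0x10000000),
    (0x10000000, 0x60000000),
    (0x60000000, 0x1FFFE0000),
]
for start, end in crash_memory_ranges(regions, boot_memory_size=0x40000000):
    print(f"PhysAddr 0x{start:016x} MemSiz 0x{end - start:016x}")
```

Clipping rather than comparing only against an exact start address is what keeps a region that lies wholly or partly inside the boot memory area from being emitted a second time.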
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Reviewed-by: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The correct "branch" event code for Power9 is "r4d05e". Replace the current
"branch" event code with "r4d05e" and add a hack to use "r10012" as the
event code for Power9 DD1.
Fixes: d89f473ff6 ("powerpc/perf: Fix PM_BRU_CMPL event code for power9")
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is no reason for that message to be pr_info(), it will be printed
every time we start a KVM guest.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use memdup_user() helper instead of open-coding to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Require all dax-drivers to register a ->copy_from_iter() operation so
that it is clear which dax_operations are optional and which must be
implemented for filesystem-dax to operate.
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that all callers of the pmem api have been converted to dax helpers that
call back to the pmem driver, we can remove include/linux/pmem.h and
asm/pmem.h.
Cc: <x86@kernel.org>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Kill this globally defined wrapper and move to libnvdimm so that we can
ultimately remove include/linux/pmem.h and asm/pmem.h.
Cc: <x86@kernel.org>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In this sequence the 'move' is assumed in the delay slot of the 'beq',
but head.S is in reorder mode and the former gets pushed one 'nop'
farther by the assembler.
The corrected behavior made booting with an UHI supplied dtb erratic.
Fixes: 15f37e1588 ("MIPS: store the appended dtb address in a variable")
Signed-off-by: Karl Beldan <karl.beldan+oss@gmail.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Gorski <jogo@openwrt.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16614/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull ARM fixes from Russell King:
"Three more fixes:
- Fix the previous fix merged in the last pull for the Thumb2
decompressor.
- A fix from Vladimir to correctly identify the V7M cache type.
- The optimised 3G vmsplit case does not work with LPAE, so don't
allow this to be selected for LPAE configurations"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8682/1: V7M: Set cacheid iff DminLine or IminLine is nonzero
ARM: 8681/1: make VMSPLIT_3G_OPT depends on !ARM_LPAE
ARM: 8680/1: boot/compressed: fix inappropriate Thumb2 mnemonic for __nop
I noticed that there's only one user of ftrace_arch_read_dyn_info().
That was used a while ago during the NMI updating in x86, and superh
copied it to implement its version of handling NMIs during
stop_machine().
But that is a debug feature, and this code hasn't been touched since
2009. Also, x86 no longer does the ftrace updates with stop_machine()
and instead uses breakpoints. If superh needs to modify its code, it
should implement the breakpoint conversion, and remove stop_machine().
Which also gets rid of the NMI issue.
Anyway, I want to nuke ftrace_arch_read_dyn_info() and this gets rid of
the one user, which is for an arch that shouldn't need it anymore.
Link: http://lkml.kernel.org/r/20170626181749.2ce954d1@gandalf.local.home
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add ptwrite to the op code map and the perf tools new instructions test.
To run the test:
$ tools/perf/perf test "x86 ins"
39: Test x86 instruction decoder - new instructions : Ok
Or to see the details:
$ tools/perf/perf test -v "x86 ins" 2>&1 | grep ptwrite
For information about ptwrite, refer to the Intel SDM.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: http://lkml.kernel.org/r/1495180230-19367-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
enable_nmi_window is supposed to be a no-op if we know that we'll see
a VM exit by the time the NMI window opens. This commit adds two more
cases:
* We intercept stgi so we don't need to singlestep on GIF=0.
* We emulate nested vmexit so we don't need to singlestep when nested
VM exit is required.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Singlestepping is enabled by setting the TF flag and care must be
taken to not let the guest see (and reuse at an inconvenient time)
the modified rflag value. One such case is event injection, as part
of which flags are pushed on the stack and restored later on iret.
This commit disables singlestepping when we're about to inject an
event and forces an immediate exit for us to re-evaluate the NMI
related state.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These flags are used internally by SVM so it's cleaner to not leak
them to callers of svm_get_rflags. This is similar to how the TF
flag is handled on KVM_GUESTDBG_SINGLESTEP by kvm_get_rflags and
kvm_set_rflags.
Without this change, the flags may propagate from host VMCB to nested
VMCB or vice versa while singlestepping over a nested VM enter/exit,
and then get stuck in inappropriate places.
Example: NMI singlestepping is enabled while running L1 guest. The
instruction to step over is VMRUN and nested vmrun emulation stashes
rflags to hsave->save.rflags. Then if singlestepping is disabled
while still in L2, TF/RF will be cleared from the nested VMCB but the
next nested VM exit will restore them from hsave->save.rflags and
cause an unexpected DB exception.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nested hypervisor should not see singlestep VM exits if singlestepping
was enabled internally by KVM. Windows is particularly sensitive to this
and known to bluescreen on unexpected VM exits.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just moving the code to a new helper in preparation for following
commits.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a machine check happens in the guest, related mcck info (mcic,
external damage code, ...) is stored in the vcpu's lowcore on the host.
Then the machine check handler's low-level part is executed, followed
by the high-level part.
If the high-level part's execution is interrupted by a new machine check
happening on the same vcpu on the host, the mcck info in the lowcore is
overwritten with the new machine check's data.
If the high-level part's execution is scheduled to a different cpu,
the mcck info in the lowcore is uncertain.
Therefore, for both cases, the further reinjection to the guest will use
the wrong data.
Let's back up the mcck info in the lowcore to the sie page
for further reinjection, so that the right data will be used.
Add a new member to struct sie_page to store the machine check's
mcic, failing storage address and external damage code.
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Add the logic to check if the machine check happens when the guest is
running. If yes, set the exit reason -EINTR in the machine check's
interrupt handler. Refactor s390_do_machine_check to avoid panicking
the host for some kinds of machine checks which happen
when the guest is running.
Reinject instruction processing damage machine checks, including
Delayed Access Exception, into the guest instead of damaging the host
when they happen in the guest, because they could be caused by an improper
update of a TLB entry or another software cause and affect the guest only.
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
I didn't find any use of this macro in the current kernel tree (with git
grep). KTHREAD_SIZE has not been used for a very long time, so
let's remove this definition.
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
On POWER9 the ERAT may be incorrect on wakeup from some stop states
that lose state. This causes random segvs and illegal instructions
when these stop states are enabled.
This patch invalidates the ERAT on wakeup on POWER9 to prevent this
from causing a problem.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Merge comment change with upstream changes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
From: Michael Neuling <mikey@neuling.org>
On P9 (Nimbus) DD2 and later, in radix mode, the move to the PID
register will implicitly invalidate the user space ERAT entries
and leave the kernel ones alone. Thus the only thing needed is
an isync() to synchronize this with subsequent uaccesses.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On PHB3/POWER8 systems, devices can select between two different sections
of address space, TVE#0 and TVE#1. TVE#0 is intended for 32bit devices
that aren't capable of addressing more than 4GB. Selecting TVE#1 instead,
with the capability of addressing over 4GB, is performed by setting bit 59
of a PCI address.
However, some devices aren't capable of addressing at least 59 bits, but
still want more than 4GB of DMA space. In order to enable this, reconfigure
TVE#0 to be suitable for 64-bit devices by allocating memory past the
initial 4GB that is inaccessible by 64-bit DMAs.
This bypass mode is only enabled if a device requests 4GB or more of DMA
address space, if the system has PHB3 (POWER8 systems), and if the device
does not share a PE with any devices from different vendors.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add a helper that determines if all the devices contained in a given PE
are all from the same vendor or not. This can be useful in determining
if it's okay to make PE-wide changes that may be suitable for some
devices but not for others.
This is used later in the series.
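A minimal sketch of such a check, in Python for brevity. The Device tuple, the function name and the treatment of an empty PE as "single vendor" are assumptions made for illustration; they are not taken from the kernel helper.

```python
from collections import namedtuple

# Hypothetical stand-in for a PCI device: only the IDs matter here.
Device = namedtuple("Device", ["vendor", "device"])


def pe_single_vendor(devices):
    """Return True if every device in the PE has the same vendor ID.

    An empty PE is treated as single-vendor (an assumption).
    """
    vendors = {dev.vendor for dev in devices}
    return len(vendors) <= 1


same = [Device(0x10DE, 0x1234), Device(0x10DE, 0x5678)]
mixed = [Device(0x10DE, 0x1234), Device(0x8086, 0x9ABC)]
print(pe_single_vendor(same), pe_single_vendor(mixed))
```

A caller could use the result to decide whether a PE-wide change is safe to apply to every device in the PE.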
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
As with P7IOC and PHB3, add kernel-side support for decoding and printing
diagnostic data for PHB4.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Diagnostic data for PHBs currently works by allocating a fixed-size buffer.
This is simple, but either wastes memory (though only a few kilobytes) or
in the case of PHB4 isn't enough to fit the whole data blob.
For machines that don't describe the diagnostic data size in the device
tree, use the hardcoded buffer size as before. For those that do, only
allocate exactly what's needed.
In the special case of P7IOC (which has two types of diag data), the larger
should be specified in the device tree.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Dumping the PE State Tables (PEST) can be highly verbose if a number of PEs
are affected, especially in the case where the whole PHB is frozen and 512
lines get printed. Check for duplicates when dumping the PEST to reduce
useless output.
For example:
PE[0f8] A/B: 9700002600000000 80000080d00000f8
PE[0f9] A/B: 8000000000000000 0000000000000000
PE[..0fe] A/B: as above
PE[0ff] A/B: 8440002b00000000 0000000000000000
instead of:
PE[0f8] A/B: 9700002600000000 80000080d00000f8
PE[0f9] A/B: 8000000000000000 0000000000000000
PE[0fa] A/B: 8000000000000000 0000000000000000
PE[0fb] A/B: 8000000000000000 0000000000000000
PE[0fc] A/B: 8000000000000000 0000000000000000
PE[0fd] A/B: 8000000000000000 0000000000000000
PE[0fe] A/B: 8000000000000000 0000000000000000
PE[0ff] A/B: 8440002b00000000 0000000000000000
and you can imagine how much worse it can get for 512 PEs.
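The duplicate check can be sketched as below. This is an illustrative Python model rather than the kernel implementation: the function name and the (pe, a, b) tuple input are assumptions, and it does not model any filtering of which PEs are dumped in the first place.

```python
def dump_pest(entries):
    """Format PEST entries, collapsing runs that repeat the previous line.

    entries: iterable of (pe_number, pest_a, pest_b) in ascending PE order.
    Returns the list of output lines.
    """
    lines = []
    prev = None        # (a, b) of the last entry printed in full
    last_pe = None     # last PE number seen
    in_run = False     # True while skipping duplicates of prev
    for pe, a, b in entries:
        if (a, b) == prev:
            in_run = True
        else:
            if in_run:
                # Close the run with a single summary line.
                lines.append(f"PE[..{last_pe:03x}] A/B: as above")
                in_run = False
            lines.append(f"PE[{pe:03x}] A/B: {a:016x} {b:016x}")
            prev = (a, b)
        last_pe = pe
    if in_run:
        lines.append(f"PE[..{last_pe:03x}] A/B: as above")
    return lines


entries = [(0x0F8, 0x9700002600000000, 0x80000080D00000F8)]
entries += [(pe, 0x8000000000000000, 0) for pe in range(0x0F9, 0x0FF)]
entries += [(0x0FF, 0x8440002B00000000, 0)]
print("\n".join(dump_pest(entries)))
```

Only the last PE of a run is named in the summary line, so a run of any length costs one extra line of output.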
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The asm code assumes the FP regs are at the start of fp_state. While
this is true now, it may not always be the case and there is nothing
enforcing it.
This fixes the asm-offsets to point to the actual FP registers inside
the fp_state. Similarly for VMX.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The P9 PVR bits 12-15 don't indicate a revision but instead different
chip configurations. From BookIV we have:
Bits Configuration
0 : Scale out 12 cores
1 : Scale out 24 cores
2 : Scale up 12 cores
3 : Scale up 24 cores
DD1 doesn't use this but DD2 does. Linux will mostly use the "Scale
out 24 core" configuration (ie. SMT4 not SMT8) which results in a PVR
of 0x004e1200. The revision in /proc/cpuinfo is hence
reported incorrectly as "18.0".
This patch fixes this to mask off only the relevant bits for the major
revision (ie. bits 8-11) for POWER9.
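The effect of the mask can be checked with a short calculation. This is a sketch in Python: the function names are hypothetical, and taking the minor revision from the low 8 bits is an assumption, since the text above only discusses the major revision and the configuration bits.

```python
def revision_wide_mask(pvr):
    """Major revision taken from bits 8-15, which sweeps in the config bits."""
    major = (pvr >> 8) & 0xFF
    minor = pvr & 0xFF  # assumed location of the minor revision
    return f"{major}.{minor}"


def revision_power9(pvr):
    """Major revision taken from bits 8-11 only."""
    major = (pvr >> 8) & 0xF
    minor = pvr & 0xFF  # assumed location of the minor revision
    return f"{major}.{minor}"


def chip_configuration(pvr):
    """Bits 12-15: the chip configuration, not a revision."""
    return (pvr >> 12) & 0xF


pvr = 0x004E1200
print(revision_wide_mask(pvr), revision_power9(pvr), chip_configuration(pvr))
```

For 0x004e1200 the wide mask yields 0x12, which is where the "18.0" comes from, while the narrow mask yields major revision 2 and leaves configuration value 1 ("Scale out 24 cores") to be read separately.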
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
AMD systems support the Monitor/Mwait instructions and these can be used
for ACPI C1 in the same way as on Intel systems.
Three things are needed:
1) This patch.
2) BIOS that declares a C1 state in _CST to use FFH, with correct values.
3) CPUID_Fn00000005_EDX is non-zero on the system.
BIOSes on AMD systems have historically not defined a C1 state in _CST,
so the acpi_idle driver uses HALT for ACPI C1.
Currently released systems have CPUID_Fn00000005_EDX as reserved/RAZ. If a
BIOS is released for these systems that requests a C1 state with FFH, the
FFH implementation in Linux will fail since CPUID_Fn00000005_EDX is 0. The
acpi_idle driver will then fall back to using HALT for ACPI C1.
Future systems are expected to have non-zero CPUID_Fn00000005_EDX and BIOS
support for using FFH for ACPI C1.
Allow ffh_cstate_init() to succeed on AMD systems.
Tested on Fam15h and Fam17h systems.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The goal of this change is to give users a uniform and meaningful
result when they read /sys/...cpufreq/scaling_cur_freq
on modern x86 hardware, as compared to what they get today.
Modern x86 processors include the hardware needed
to accurately calculate frequency over an interval --
APERF, MPERF, and the TSC.
Here we provide an x86 routine to make this calculation
on supported hardware, and use it in preference to any
driver-specific cpufreq_driver.get() routine.
MHz is computed like so:
MHz = base_MHz * delta_APERF / delta_MPERF
MHz is the average frequency of the busy processor
over a measurement interval. The interval is
defined to be the time between successive invocations
of aperfmperf_khz_on_cpu(), which are expected to
happen on-demand when users read sysfs attribute
cpufreq/scaling_cur_freq.
As with previous methods of calculating MHz,
idle time is excluded.
base_MHz above is from TSC calibration global "cpu_khz".
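As a worked example of the formula, here is a small Python sketch. The function name, the use of kHz units and the choice to return 0 when MPERF has not advanced are assumptions for illustration, not a description of the kernel routine.

```python
def average_khz(base_khz, delta_aperf, delta_mperf):
    """Average busy frequency over an interval, in kHz.

    base_khz:     base frequency (the role played by cpu_khz in the text)
    delta_aperf:  APERF ticks accumulated over the interval
    delta_mperf:  MPERF ticks accumulated over the same interval
    """
    if delta_mperf == 0:
        # No busy time was observed; 0 is an arbitrary placeholder here.
        return 0
    return base_khz * delta_aperf // delta_mperf


# A CPU with a 2 GHz base clock whose APERF advanced 1.5x as fast as MPERF
# averaged 3 GHz while busy.
print(average_khz(2_000_000, 150_000, 100_000))
```

Because both counters only advance while the CPU is not idle, the ratio reflects the busy frequency and idle time drops out of the result.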
This x86 native method to calculate MHz returns a meaningful result
no matter if P-states are controlled by hardware or firmware
and/or if the Linux cpufreq sub-system is or is not installed.
When this routine is invoked more frequently, the measurement
interval becomes shorter. However, the code limits re-computation
to 10ms intervals so that average frequency remains meaningful.
Discerning users are encouraged to take advantage of
the turbostat(8) utility, which can gracefully handle
concurrent measurement intervals of arbitrary length.
Signed-off-by: Len Brown <len.brown@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull s390 bugfix from Martin Schwidefsky:
"One last s390 patch for 4.12
Revert the re-IPL semantics back to the v4.7 state. It turned out that
the memory layout may change due to memory hotplug if load-normal is
used"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL
The MCE severity gives a hint as to how to handle the error. The
notifier blocks can then use the severity to decide on an action.
It's not necessary for machine_check_poll() to filter errors for
the notifier chain, since each block will check its own set of
conditions before handling an error.
Also, there isn't any urgency for machine_check_poll() to make decisions
based on severity like in do_machine_check().
If we can assume that a severity is set then we can use it in more
notifier blocks. For example, the CEC block could check for a "KEEP"
severity rather than checking bits in the status. This isn't possible
now since the severity is not set except for "DEFERRED/UCNA" errors with
a valid address.
Save the severity since we have it, and let the notifier blocks decide
if they want to do anything.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1498074402-98633-1-git-send-email-Yazen.Ghannam@amd.com
The helper function __load_ucode_amd() and pointer intel_ucode_patch do
not need to be in global scope, so make them static.
Fixes these sparse warnings:
"symbol '__load_ucode_amd' was not declared. Should it be static?"
"symbol 'intel_ucode_patch' was not declared. Should it be static?"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170622095736.11937-1-colin.king@canonical.com
Larry Finger reported that his Powerbook G4 was no longer booting with v4.12-rc;
userspace came up but gave weird errors such as:
udevd[64]: starting version 175
udevd[64]: Unable to receive ctrl message: Bad address.
modprobe: chdir(4.12-rc1): No such file or directory
He bisected the problem to commit 3448890c32 ("powerpc: get rid of zeroing,
switch to RAW_COPY_USER").
Al identified that the problem is actually a miscompilation by GCC 4.6.3, which
is exposed by the above commit.
Al also pointed out that inlining copy_to/from_user() is probably of little or
no benefit, which is correct. Using Anton's copy_to_user benchmark, with a
pathological single byte copy, we see a small increase in performance
by *removing* inlining:
Before (inlined):
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m22.063s
real 0m22.059s
real 0m22.076s
After:
# time ./copy_to_user -w -l 1 -i 10000000 ( x 3 )
real 0m21.325s
real 0m21.299s
real 0m21.364s
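As a rough check of the size of that improvement, the averages of the three runs on each side can be compared. The sketch below only does arithmetic on the timings quoted above; the function name is made up for the example.

```python
from statistics import mean

# "real" times in seconds, copied from the two benchmark runs above.
before = [22.063, 22.059, 22.076]  # inlined
after = [21.325, 21.299, 21.364]   # out of line


def improvement_percent(before, after):
    """Relative reduction in mean wall-clock time, in percent."""
    b = mean(before)
    a = mean(after)
    return (b - a) / b * 100.0


print(f"{improvement_percent(before, after):.1f}% less wall-clock time")
```

The difference works out to a little over 3%, which is small but consistent across all three runs.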
So as a small performance improvement and to avoid the miscompilation, drop
inlining copy_to/from_user() on 32-bit.
Fixes: 3448890c32 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER")
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Since commit:
af2cf278ef ("x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()")
we no longer free PUDs so that we do not have to synchronize
all PGDs on hot-remove/vfree().
But the new 5-level page table patchset reverted that for 4-level
page tables, in the following commit:
f2a6a70501: ("x86: Convert the rest of the code to support p4d_t")
This patch repairs the damage and disables free_pud() if we are in the
4-level page table case, thus avoiding BUG_ON() after hot-remove.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
[ Clarified the changelog and the code comments. ]
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20170624180514.3821-1-jglisse@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>