With the ffz() function as defined in arch/powerpc/include/asm/bitops.h
GCC will not optimise the code in case of constant parameter.
This patch replaces ffz() by the generic function.
The generic ffz(x) expects to never be called with ~x == 0
as written in the comment in include/asm-generic/bitops/ffz.h
The only user of ffz() within arch/powerpc/ is
platforms/512x/mpc5121_ads_cpld.c, which checks if x is not 0xff
For non constant calls, the generated code is doing the same:
unsigned long testffz(unsigned long x)
{
return ffz(x);
}
On PPC32, before the patch:
00000018 <testffz>:
18: 7c 63 18 f9 not. r3,r3
1c: 40 82 00 0c bne 28 <testffz+0x10>
20: 38 60 00 20 li r3,32
24: 4e 80 00 20 blr
28: 7d 23 00 d0 neg r9,r3
2c: 7d 23 18 38 and r3,r9,r3
30: 7c 63 00 34 cntlzw r3,r3
34: 20 63 00 1f subfic r3,r3,31
38: 4e 80 00 20 blr
On PPC32, after the patch:
00000018 <testffz>:
18: 39 23 00 01 addi r9,r3,1
1c: 7d 23 18 78 andc r3,r9,r3
20: 7c 63 00 34 cntlzw r3,r3
24: 20 63 00 1f subfic r3,r3,31
28: 4e 80 00 20 blr
On PPC64, before the patch:
0000000000000030 <.testffz>:
30: 7c 60 18 f9 not. r0,r3
34: 38 60 00 40 li r3,64
38: 4d 82 00 20 beqlr
3c: 7c 60 00 d0 neg r3,r0
40: 7c 63 00 38 and r3,r3,r0
44: 7c 63 00 74 cntlzd r3,r3
48: 20 63 00 3f subfic r3,r3,63
4c: 7c 63 07 b4 extsw r3,r3
50: 4e 80 00 20 blr
On PPC64, after the patch:
0000000000000030 <.testffz>:
30: 38 03 00 01 addi r0,r3,1
34: 7c 03 18 78 andc r3,r0,r3
38: 7c 63 00 74 cntlzd r3,r3
3c: 20 63 00 3f subfic r3,r3,63
40: 4e 80 00 20 blr
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
It often happens to have simultaneous interrupts, for instance
when having double Ethernet attachment. With the current
implementation, we suffer the cost of kernel entry/exit for each
interrupt.
This patch introduces a loop in __do_irq() to handle all interrupts
at once before returning.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
IRQ 0 is a valid HW interrupt. So get_irq() shall return 0 when
there is no irq, instead of returning irq_linear_revmap(... ,0)
Fixes: f2a0bd3753 ("[POWERPC] 8xx: powerpc port of core CPM PIC")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The 8xx has a dedicated exception for breakpoints, that directly
calls do_break()
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The result of (trap == 0x400) is already in is_exec.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Function store_updates_sp() checks whether the faulting
instruction is a store updating r1. Therefore we can limit its calls
to store exceptions.
This patch is an improvement of commit a7a9dcd882 ("powerpc: Avoid
taking a data miss on every userspace instruction miss")
With the same microbenchmark app, run with 500 as argument, on an
MPC885 we get:
Before this patch: 152000 DTLB misses
After this patch: 147000 DTLB misses
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This function has not been used since commit 9494a1e842
("powerpc: use generic fixmap.h)
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The check in hpte_find() should be < and not <= for PAGE_OFFSET
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add --orphan-handling=warn to final link flags. This ensures we can
handle all sections explicitly. This would have caught subtle breakage
such as 7de3b27bac at build-time.
Also bring existing orphan sections into the fold:
- .text.hot and .text.unlikely are compiler generated sections.
- .sdata2, .dynsbss, .plt are used by PPC32
- We previously did not specify DWARF_DEBUG or STABS_DEBUG
- DWARF_DEBUG did not include all DWARF sections that can be emitted
- A number of sections are unused and can be discarded.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use a tool to check that the location of "fixed sections" are where
we expected them to be, which catches cases the linker script can't
(stubs being added to start of .text section), and which ends up
being neater.
Sample output:
ERROR: start_text address is c000000000008100, should be c000000000008000
ERROR: see comments in arch/powerpc/tools/head_check.sh
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fold in fix from Nick for 4.6 era toolchains]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Very large kernels may require linker stubs for branches from HEAD
text code. The linker may place these stubs before the HEAD text
sections, which breaks the assumption that HEAD text is located at 0
(or the .text section being located at 0x7000/0x8000 on Book3S
kernels).
Provide an option to create a small section just before the .text
section with an empty 256 - 4 bytes, and adjust the start of the .text
section to match. The linker will tend to put stubs in that section
and not break our relative-to-absolute offset assumptions.
This causes a small waste of space on common kernels, but allows large
kernels to build and boot. For now, it is an EXPERT config option,
defaulting to =n, but a reference is provided for it in the build-time
check for such breakage. This is good enough for allyesconfig and
custom users / hackers.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Direct banches from code below __end_interrupts to code above
__end_interrupts when built with CONFIG_RELOCATABLE are disallowed
because they will break when the kernel is not located at 0.
Sample output:
WARNING: Unrelocated relative branches
c000000000000118 bl-> 0xc000000000038fb8 <pnv_restore_hyp_resource>
c00000000000013c b-> 0xc0000000001068a4 <kvm_start_guest>
c000000000000148 b-> 0xc00000000003919c <pnv_wakeup_loss>
c00000000000014c b-> 0xc00000000003923c <pnv_wakeup_noloss>
c0000000000005a4 b-> 0xc000000000106ffc <kvmppc_interrupt_hv>
c000000000001af0 b-> 0xc000000000106ffc <kvmppc_interrupt_hv>
c000000000001b24 b-> 0xc000000000106ffc <kvmppc_interrupt_hv>
c000000000001b58 b-> 0xc000000000106ffc <kvmppc_interrupt_hv>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For final link, the powerpc64 linker generates fpr save/restore
functions on-demand, placing them in the .sfpr section. Starting with
binutils 2.25, these can be provided for non-final links with
--save-restore-funcs. Use that where possible for module links.
This saves about 200 bytes per module (~60kB) on powernv defconfig
build.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is no need to create a new section for these. Consolidate with
32-bit and just use .text.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
crtsaveres.S is empty with 64-bit builds already, so just don't
build and link it to match the vmlinux build.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use CONFIG_PPC64_BOOT_WRAPPER not CONFIG_PPC32 to fix BE build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The 64-bit linker creates save/restore functions on demand with final
links, so vmlinux does not require crtsavres.o.
Make crtsavres.o extra-y on 64-bit (it is still required by modules).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The arch version is identical except for comments and white space.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
These are completely obvious as all they do is include the asm-generic
versions.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Prevent a kernel panic caused by unintentionally clearing TCR watchdog
bits. At this point in the kernel boot, the watchdog may have already
been enabled by u-boot. The original code's attempt to write to the TCR
register results in an inadvertent clearing of the watchdog
configuration bits, causing the 476 to reset.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch adds default FSP2 config for main usage.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add platform code support for FSP2 (476fpe) board.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On Power9 DD1 due to a hardware bug the Power-Saving Level Status
field (PLS) of the PSSCR for a thread waking up from a deep state can
under-report if some other thread in the core is in a shallow stop
state. The scenario in which this can manifest is as follows:
1) All the threads of the core are in deep stop.
2) One of the threads is woken up. The PLS for this thread will
correctly reflect that it is waking up from deep stop.
3) The thread that has woken up now executes a shallow stop.
4) When some other thread in the core is woken, its PLS will reflect
the shallow stop state.
Thus, the subsequent thread for which the PLS is under-reporting the
wakeup state will not restore the hypervisor resources.
Hence, on DD1 systems, use the Requested Level (RL) field as a
workaround to restore the contents of the hypervisor resources on the
wakeup from the stop state.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some of the SPR values (HID0, MSR, SPRG0) don't change during the run
time of a booted kernel, once they have been initialized.
The contents of these SPRs are lost when the CPUs enter deep stop
states. So instead saving and restoring SPRs from the kernel, use the
stop-api provided by the firmware by which the firmware can restore
the contents of these SPRs to their initialized values after wakeup
from a deep stop state.
Apart from these, program the PSSCR value to that of the deepest stop
state via the stop-api. This will be used to indicate to the
underlying firmware as to what stop state to put the threads that have
been woken up by a special-wakeup.
And while we are at programming SPRs via stop-api, ensure that HID1,
HID4 and HID5 registers which are only available on POWER8 are not
requested to be restored by the firware on POWER9.
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On wakeup from a deep stop state which is supposed to lose the
hypervisor state, we don't restore the LPCR to the old value but set
it to a "sane" value via cur_cpu_spec->cpu_restore().
The problem is that the "sane" value doesn't include UPRT and the HR
bits which are required to run correctly in Radix mode.
Fix this on POWER9 onwards by restoring the LPCR value whatever it was
before executing the stop instruction.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On POWER8, in case of
- nap: both timebase and hypervisor state is retained.
- fast-sleep: timebase is lost. But the hypervisor state is retained.
- winkle: timebase and hypervisor state is lost.
Hence, the current code for handling exit from a idle state assumes
that if the timebase value is retained, then so is the hypervisor
state. Thus, the current code doesn't restore per-core hypervisor
state in such cases.
But that is no longer the case on POWER9 where we do have stop states
in which timebase value is retained, but the hypervisor state is
lost. So we have to ensure that the per-core hypervisor state gets
restored in such cases.
Fix this by ensuring that even in the case when timebase is retained,
we explicitly check if we are waking up from a deep stop that loses
per-core hypervisor state (indicated by cr4 being eq or gt), and if
this is the case, we restore the per-core hypervisor state.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The lower 8 bits of core_idle_state_ptr tracks the number of non-idle
threads in the core. This is supposed to be initialized to bit-map
corresponding to the threads_per_core. However, currently it is
initialized to PNV_CORE_IDLE_THREAD_BITS (0xFF). This is correct for
POWER8 which has 8 threads per core, but not for POWER9 which has 4
threads per core.
As a result, on POWER9, core_idle_state_ptr gets initialized to
0xFF. In case when all the threads of the core are idle, the bits
corresponding tracking the idle-threads are non-zero. As a result, the
idle entry/exit code fails to save/restore per-core hypervisor state
since it assumes that there are threads in the cores which are still
active.
Fix this by correctly initializing the lower bits of the
core_idle_state_ptr on the basis of threads_per_core.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Allow us to enable IRQ_TIME_ACCOUNTING. Even though we currently
use VIRT_CPU_ACCOUNTING_NATIVE, that option is quite heavy
weight and IRQ_TIME_ACCOUNTING might be better in some cases.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently the DTS defines two partitions at the same addresses, if you
use one, you will corrupt information on the other one. Fix it by
shifting the second partition.
Signed-off-by: Pavel Machek <pavel@denx.de>
[mpe: Reconstruct change log from email thread]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Experiments with the netperf benchmark indicated that the size selecting
VMX-based copies in __copy_tofrom_user_power7() was suboptimal on POWER8.
Measurements showed that parity was in the neighbourhood of 3328 bytes,
rather than greater than 4096. The change gives a 1.5-2.0% improvement in
performance for 4096-byte buffers, reducing the relative time spent in
__copy_tofrom_user_power7() from approximately 7% to approximately 5% in
the TCP_RR benchmark.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rearrange the code so that mode and badaddr are only defined when
they're used.
Also unsplit the string for easier grepping, and switch from CONFIG_8xx
which is deprecated to CONFIG_PPC_8xx.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes: a7cd88da97 ("powerpc/powernv: Move CPU-Offline idle state invocation from smp.c to idle.c")
Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
OPAL_CALL uses SRR[01] with MSR_RI=1, which gets corrupted if there
is an interleaving system reset or machine check interrupt.
Use HSRR[01] instead, which does not require MSR_RI=0.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
FIXUP_ENDIAN uses SRR[01] with MSR_RI=1, which gets corrupted if there
is an interleaving system reset or machine check interrupt.
Set MSR_RI=0 before setting SRRs. The rfid will restore MSR.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Fix running SPU programs on Cell, and a few other minor fixes.
Thanks to:
Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas Piggin.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZKURPAAoJEFHr6jzI4aWAhckQAJxZZHt2OMbdNu0PxHdhZgxo
+eSODIF0jvzBnYs/Pe9aqqrxuONxW+ioclyUIVYLUlHwLjCGf7x2Y5tJe0OmEff6
ZOaUUcwECKw4cy2UJY6NCGv0nw/8INTDN5xPcQq9M8gExmX6plTmbnQg8Y10ONdQ
LYu36GWyXF4ygblvLo7kXs8tuZYKozO6kPRqxiQ3zML2dOAyqWqPwpnoWSc6c7oR
W+z/Vuxe3lTR+QHbfvnSpQhmdVi+WEnwFvgNmIise5R9Jd30Q1f1vES5E089ifmN
b0Qi5/gkb6YWBkROvxTARFRdmU0/YPNDFWUsuyHJB/Nz1MnqqXx5X+5KpqqinPya
azVoYW010x2zawm0aX+BF/WeH5ymsl++R84/aO8UR0fA2AIwEOQeLGWZvaZb8CMl
9vd3NqCJ+diBhgCHiHp80pjD978bqt7Ls1nfbHhYTJ31HRT8Yz/ympWOhLE6rp+t
kGR+UOHduaZWK3KHoE2WIoUFJuQMvRgFmjoy2G+YaK/PcUc8OA+90v1665fnbk+N
wmZyAirP39gveHkHXDywqbEjN4CSMgsqrRW/KwPo0b4mf2m3rsIAshO9pBbZRv+P
evhrAkCYRv5zGbGIYJ/TiEyball+8NQzxzoYmMzq62pjE27gyIe94Sqy80U4zyOC
RqqUxflOBgMDC8Ufc30u
=EM32
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix running SPU programs on Cell, and a few other minor fixes.
Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas
Piggin"
* tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions
powerpc/spufs: Fix hash faults for kernel regions
powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N
powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call
selftests/powerpc: Fix TM resched DSCR test with some compilers
Pull x86 fixes from Thomas Gleixner:
"A series of fixes for X86:
- The final fix for the end-of-stack issue in the unwinder
- Handle non PAT systems gracefully
- Prevent access to uninitiliazed memory
- Move early delay calaibration after basic init
- Fix Kconfig help text
- Fix a cross compile issue
- Unbreak older make versions"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/timers: Move simple_udelay_calibration past init_hypervisor_platform
x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives()
x86/PAT: Fix Xorg regression on CPUs that don't support PAT
x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation
x86/build: Permit building with old make versions
x86/unwind: Add end-of-stack check for ftrace handlers
Revert "x86/entry: Fix the end of the stack for newly forked tasks"
x86/boot: Use CROSS_COMPILE prefix for readelf
Pull RAS fixes from Thomas Gleixner:
"Two fixlets for RAS:
- Export memory_error() so the NFIT module can utilize it
- Handle memory errors in NFIT correctly"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
acpi, nfit: Fix the memory error check in nfit_handle_mce()
x86/MCE: Export memory_error()
One was simply a memory leak where not all was being freed that should
have been in releasing a file pointer on set_graph_function.
Then Thomas found that the ftrace trampolines were marked for read/write
as well as execute. To shrink the possible attack surface, he added
calls to set them to ro. Which also uncovered some other issues with
freeing module allocated memory that had its permissions changed.
Kprobes had a similar issue which is fixed and a selftest was added
to trigger that issue again.
-----BEGIN PGP SIGNATURE-----
iQExBAABCAAbBQJZKOiVFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
vBoH/jxVozuAEVCv+Nbj6fhRxe4emjo0lZZb32EbEaSV/nUQGqHIZFdDQtbt+ld+
sn06/BSMBI+L4BqLj1BCAW0e/zIn/4birIg53SX5jQwc3AlhUG7HS2d+RJZZCrp9
Zofq9L6xZ4Hl2XjkPXqwEgtrwxQtkIPLlJqeYDJ6BVrlPfOPEwB7bfR7B684wiYT
6h2Qo7f/ZQzgJ1sK8N2IjHEnAgE08KCYcj4IB4WHJk6SqQz3bv1Y00WBg2UQihVT
TPPSVhYLnrSw53fxyALqZbHo2DvnQf1TnNadWxvSIpbvgm/T5GG60FDtvHgNfbwz
yKuKAog+P9xBLkoAcfvODLY9O5s=
=75TZ
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fixes from Steven Rostedt:
"There's been a few memory issues found with ftrace.
One was simply a memory leak where not all was being freed that should
have been in releasing a file pointer on set_graph_function.
Then Thomas found that the ftrace trampolines were marked for
read/write as well as execute. To shrink the possible attack surface,
he added calls to set them to ro. Which also uncovered some other
issues with freeing module allocated memory that had its permissions
changed.
Kprobes had a similar issue which is fixed and a selftest was added to
trigger that issue again"
* tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
x86/ftrace: Make sure that ftrace trampolines are not RWX
x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
selftests/ftrace: Add a testcase for many kprobe events
kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
ftrace: Fix memory leak in ftrace_graph_release()
ftrace use module_alloc() to allocate trampoline pages. The mapping of
module_alloc() is RWX, which makes sense as the memory is written to right
after allocation. But nothing makes these pages RO after writing to them.
Add proper set_memory_rw/ro() calls to protect the trampolines after
modification.
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
With function tracing starting in early bootup and having its trampoline
pages being read only, a bug triggered with the following:
kernel BUG at arch/x86/mm/pageattr.c:189!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
task: ffffffffb4222500 task.stack: ffffffffb4200000
RIP: 0010:change_page_attr_set_clr+0x269/0x302
RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046
RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000
RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60
RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001
R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000
R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0
Call Trace:
change_page_attr_clear+0x1f/0x21
set_memory_ro+0x1e/0x20
arch_ftrace_update_trampoline+0x207/0x21c
? ftrace_caller+0x64/0x64
? 0xffffffffc029b000
ftrace_startup+0xf4/0x198
register_ftrace_function+0x26/0x3c
function_trace_init+0x5e/0x73
tracer_init+0x1e/0x23
tracing_set_tracer+0x127/0x15a
register_tracer+0x19b/0x1bc
init_function_trace+0x90/0x92
early_trace_init+0x236/0x2b3
start_kernel+0x200/0x3f5
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x17c/0x18f
secondary_startup_64+0x9f/0x9f
? secondary_startup_64+0x9f/0x9f
Interrupts should not be enabled at this early in the boot process. It is
also fine to leave interrupts enabled during this time as there's only one
CPU running, and on_each_cpu() means to only run on the current CPU.
If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with
interrupts disabled. Don't trigger a BUG_ON() in that case.
Link: http://lkml.kernel.org/r/20170526093717.0be3b849@gandalf.local.home
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix kprobes to set(recover) RWX bits correctly on trampoline
buffer before releasing it. Releasing readonly page to
module_memfree() crash the kernel.
Without this fix, if kprobes user register a bunch of kprobes
in function body (since kprobes on function entry usually
use ftrace) and unregister it, kernel hits a BUG and crash.
Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: d0381c81c2 ("kprobes/x86: Set kprobes pages read-only")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
This ensures that adjustments to x86_platform done by the hypervisor
setup is already respected by this simple calibration.
The current user of this, introduced by 1b5aeebf3a ("x86/earlyprintk:
Add support for earlyprintk via USB3 debug port"), comes much later
into play.
Fixes: dd759d93f4 ("x86/timers: Add simple udelay calibration")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: http://lkml.kernel.org/r/5e89fe60-aab3-2c1c-aba8-32f8ad376189@siemens.com
Providing "scv" support to userspace requires kernel support, so it
must be advertised as independently to the base ISA 3 instruction set.
The darn instruction relies on firmware enablement, so it has been
decided to split this out from the core ISA 3 feature as well.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit ac29c64089 ("powerpc/mm: Replace _PAGE_USER with
_PAGE_PRIVILEGED") swapped _PAGE_USER for _PAGE_PRIVILEGED, and
introduced check_pte_access() which denied kernel access to
non-_PAGE_PRIVILEGED pages.
However, it didn't add _PAGE_PRIVILEGED to the hash fault handler
for spufs' kernel accesses, so the DMAs required to establish SPE
memory no longer work.
This change adds _PAGE_PRIVILEGED to the hash fault handler for
kernel accesses.
Fixes: ac29c64089 ("powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Reported-by: Sombat Tragolgosol <sombat3960@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently if you disable CONFIG_PPC_RADIX_MMU you'll crash on boot on
a P9. This is because we still set MMU_FTR_TYPE_RADIX via
ibm,pa-features and MMU_FTR_TYPE_RADIX is what's used for code patching
in much of the asm code (ie. slb_miss_realmode)
This patch fixes the problem by stopping MMU_FTR_TYPE_RADIX from being
set from ibm.pa-features.
We may eventually end up removing the CONFIG_PPC_RADIX_MMU option
completely but until then this fixes the issue.
Fixes: 17a3dd2f5f ("powerpc/mm/radix: Use firmware feature to enable Radix MMU")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
opal_npu_destroy_context() should be called with the NPU PHB, not the
PCIe PHB.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In the current form of the code, if a->replacementlen is 0, the reference
to *insnbuf for comparison touches potentially garbage memory. While it
doesn't affect the execution flow due to the subsequent a->replacementlen
comparison, it is (rightly) detected as use of uninitialized memory by a
runtime instrumentation currently under my development, and could be
detected as such by other tools in the future, too (e.g. KMSAN).
Fix the "false-positive" by reordering the conditions to first check the
replacement instruction length before referencing specific opcode bytes.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/20170524135500.27223-1-mjurczyk@google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.
However, on AMD K6-3 CPUs, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.
The result of this bug is that a kernel warning is produced when attempting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers:
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
...
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
To fix the bug change pat_enabled() so that it returns true only if PAT
initialization was actually done.
Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
feature on the processor that is being initialized.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
At least Make 3.82 dislikes the tab in front of the $(warning) function:
arch/x86/Makefile:162: *** recipe commences before first target. Stop.
Let's be gentle.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1944fcd8-e3df-d1f7-c0e4-60aeb1917a24@siemens.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dave Jones and Steven Rostedt reported unwinder warnings like the
following:
WARNING: kernel stack frame pointer at ffff8800bda0ff30 in sshd:1090 has bad value 000055b32abf1fa8
In both cases, the unwinder was attempting to unwind from an ftrace
handler into entry code. The callchain was something like:
syscall entry code
C function
ftrace handler
save_stack_trace()
The problem is that the unwinder's end-of-stack logic gets confused by
the way ftrace lays out the stack frame (with fentry enabled).
I was able to recreate this warning with:
echo call_usermodehelper_exec_async:stacktrace > /sys/kernel/debug/tracing/set_ftrace_filter
(exit login session)
I considered fixing this by changing the ftrace code to rewrite the
stack to make the unwinder happy. But that seemed too intrusive after I
implemented it. Instead, just add another check to the unwinder's
end-of-stack logic to detect this special case.
Side note: We could probably get rid of these end-of-stack checks by
encoding the frame pointer for syscall entry just like we do for
interrupt entry. That would be simpler, but it would also be a lot more
intrusive since it would slightly affect the performance of every
syscall.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: live-patching@vger.kernel.org
Fixes: c32c47c68a ("x86/unwind: Warn on bad frame pointer")
Link: http://lkml.kernel.org/r/671ba22fbc0156b8f7e0cfa5ab2a795e08bc37e1.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Petr Mladek reported the following warning when loading the livepatch
sample module:
WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
...
Call Trace:
__schedule+0x273/0x820
schedule+0x36/0x80
kthreadd+0x305/0x310
? kthread_create_on_cpu+0x80/0x80
? icmp_echo.part.32+0x50/0x50
ret_from_fork+0x2c/0x40
That warning means the end of the stack is no longer recognized as such
for newly forked tasks. The problem was introduced with the following
commit:
ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
... which was completely misguided. It only partially fixed the
reported issue, and it introduced another bug in the process. None of
the other entry code saves the frame pointer before calling into C code,
so it doesn't make sense for ret_from_fork to do so either.
Contrary to what I originally thought, the original issue wasn't related
to newly forked tasks. It was actually related to ftrace. When entry
code calls into a function which then calls into an ftrace handler, the
stack frame looks different than normal.
The original issue will be fixed in the unwinder, in a subsequent patch.
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: live-patching@vger.kernel.org
Fixes: ff3f7e2475 ("x86/entry: Fix the end of the stack for newly forked tasks")
Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The description of the connection between the dwmmc (SDIO) controller and
the Wifi chip, which is attached to the SDIO bus is wrong. Currently the
SDIO card can't be detected and thus the Wifi doesn't work.
Let's fix this by assigning the correct vmmc supply, which is the always on
regulator VDD_3V3 and remove the WLAN enable regulator altogether. Then to
properly deal with the power on/off sequence, add a mmc-pwrseq node to
describe the resources needed to detect the SDIO card.
Except for the WLAN enable GPIO and its corresponding assert/de-assert
delays, the mmc-pwrseq node also contains a handle to a clock provided by
the hi655x pmic. This clock is also needed to be able to turn on the WiFi
chip.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Move the board specific descriptions for the dwmmc nodes in the hi6220 SoC
dtsi, into the hikey dts as it's there these belongs.
While changing this, let's take the opportunity to drop the use of the
"ti,non-removable" binding for one of the dwmmc device nodes, as it's not a
valid binding and not used. Drop also the unnecessary use of "num-slots =
<0x1>" for all of the dwmmc nodes, as there is no need to set this since
when default number of slots is one.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Add these regulators to better describe the HW, but also because those is
needed in following changes.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The regulator is a part of the hikey board, therefore let's move it from
the hi6220 SoC dtsi file into the hikey dts file . Let's also rename the
regulator according to the datasheet (5V_HUB) to better reflect the HW.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The hi655x PMIC provides the regulators but also a clock. The latter is
missing so let's add it. This clock is used by WiFi/Bluetooth chip, but
that connection is done in a separate change on top of this one.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
[Ulf: Split patch and updated changelog]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The code to fetch a 64-bit value from user space was entirely buggered,
and has been since the code was merged in early 2016 in commit
b2f680380d ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit
kernels").
Happily the buggered routine is almost certainly entirely unused, since
the normal way to access user space memory is just with the non-inlined
"get_user()", and the inlined version didn't even historically exist.
The normal "get_user()" case is handled by external hand-written asm in
arch/x86/lib/getuser.S that doesn't have either of these issues.
There were two independent bugs in __get_user_asm_u64():
- it still did the STAC/CLAC user space access marking, even though
that is now done by the wrapper macros, see commit 11f1a4b975
("x86: reorganize SMAP handling in user space accesses").
This didn't result in a semantic error, it just means that the
inlined optimized version was hugely less efficient than the
allegedly slower standard version, since the CLAC/STAC overhead is
quite high on modern Intel CPU's.
- the double register %eax/%edx was marked as an output, but the %eax
part of it was touched early in the asm, and could thus clobber other
inputs to the asm that gcc didn't expect it to touch.
In particular, that meant that the generated code could look like
this:
mov (%eax),%eax
mov 0x4(%eax),%edx
where the load of %edx obviously was _supposed_ to be from the 32-bit
word that followed the source of %eax, but because %eax was
overwritten by the first instruction, the source of %edx was
basically random garbage.
The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark
the 64-bit output as early-clobber to let gcc know that no inputs should
alias with the output register.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@kernel.org # v4.8+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al noticed that unsafe_put_user() had type problems, and fixed them in
commit a7cc722fff ("fix unsafe_put_user()"), which made me look more
at those functions.
It turns out that unsafe_get_user() had a type issue too: it limited the
largest size of the type it could handle to "unsigned long". Which is
fine with the current users, but doesn't match our existing normal
get_user() semantics, which can also handle "u64" even when that does
not fit in a long.
While at it, also clean up the type cast in unsafe_put_user(). We
actually want to just make it an assignment to the expected type of the
pointer, because we actually do want warnings from types that don't
convert silently. And it makes the code more readable by not having
that one very long and complex line.
[ This patch might become stable material if we ever end up back-porting
any new users of the unsafe uaccess code, but as things stand now this
doesn't matter for any current existing uses. ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Export the function which checks whether an MCE is a memory error to
other users so that we can reuse the logic. Drop the boot_cpu_data use,
while at it, as mce.cpuvendor already has the CPU vendor in there.
Integrate a piece from a patch from Vishal Verma
<vishal.l.verma@intel.com> to export it for modules (nfit).
The main reason we're exporting it is that the nfit handler
nfit_handle_mce() needs to detect a memory error properly before doing
its recovery actions.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull misc uaccess fixes from Al Viro:
"Fix for unsafe_put_user() (no callers currently in mainline, but
anyone starting to use it will step into that) + alpha osf_wait4()
infoleak fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
osf_wait4(): fix infoleak
fix unsafe_put_user()
__put_user_size() relies upon its first argument having the same type as what
the second one points to; the only other user makes sure of that and
unsafe_put_user() should do the same.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The boot code Makefile contains a straight 'readelf' invocation. This
causes build warnings in cross compile environments, when there is no
unprefixed readelf accessible via $PATH.
Add the missing $(CROSS_COMPILE) prefix.
[ tglx: Rewrote changelog ]
Fixes: 98f7852537 ("x86/boot: Refuse to build with data relocations")
Signed-off-by: Rob Landley <rob@landley.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: "H.J. Lu" <hjl.tools@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/ced18878-693a-9576-a024-113ef39a22c0@landley.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
ARM:
- A fix for a build failure introduced in -rc1 when tracepoints are
enabled on 32-bit ARM.
- Disabling use of stack pointer protection in the hyp code which can
cause panics.
- A handful of VGIC fixes.
- A fix to the init of the redistributors on GICv3 systems that
prevented boot with kvmtool on GICv3 systems introduced in -rc1.
- A number of race conditions fixed in our MMU handling code.
- A fix for the guest being able to program the debug extensions for
the host on the 32-bit side.
PPC:
- Fixes for build failures with PR KVM configurations.
- A fix for a host crash that can occur on POWER9 with radix guests.
x86:
- Fixes for nested PML and nested EPT.
- A fix for crashes caused by reserved bits in SSE MXCSR that could
have been set by userspace.
- An optimization of halt polling that fixes high CPU overhead.
- Fixes for four reports from Dan Carpenter's static checker.
- A protection around code that shouldn't have been preemptible.
- A fix for port IO emulation.
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJZHzY3AAoJEED/6hsPKofocI8H/AiOHXi6AC/3s9Ok3IbN/Wp6
+xSm1yqgxitGhpmKIJQyKMUTV0t8SblRV2nxvW7/MEyfl7vztiyWENaVFc6pO6N7
GbnLvdImZ9aypoBaxVOY8WG/CHw2XZ7oUYyBIGrWECH3k+fptBNdISFK3D76+4G2
+tAuWSpKSQFwjGxtreUSlnvQBp6Tjh/PqTyxslPs4zYCL6UPKSSVAoxy4yOKj3AX
G03tx/1U1n/hSJHub9RFqho4dhVGT/p3V6oppZmS1g/ZqGPQwK1wxlYquHOtORFR
Iq8LdkNQwTdkLlTTOG+tamYSfzn0+KhczfWjIh6ZEb79ARrUSnBU4Awpvom1C2A=
=B6Rl
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- a fix for a build failure introduced in -rc1 when tracepoints are
enabled on 32-bit ARM.
- disable use of stack pointer protection in the hyp code which can
cause panics.
- a handful of VGIC fixes.
- a fix to the init of the redistributors on GICv3 systems that
prevented boot with kvmtool on GICv3 systems introduced in -rc1.
- a number of race conditions fixed in our MMU handling code.
- a fix for the guest being able to program the debug extensions for
the host on the 32-bit side.
PPC:
- fixes for build failures with PR KVM configurations.
- a fix for a host crash that can occur on POWER9 with radix guests.
x86:
- fixes for nested PML and nested EPT.
- a fix for crashes caused by reserved bits in SSE MXCSR that could
have been set by userspace.
- an optimization of halt polling that fixes high CPU overhead.
- fixes for four reports from Dan Carpenter's static checker.
- a protection around code that shouldn't have been preemptible.
- a fix for port IO emulation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (27 commits)
KVM: x86: prevent uninitialized variable warning in check_svme()
KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh()
KVM: x86: zero base3 of unusable segments
KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation
KVM: x86: Fix potential preemption when get the current kvmclock timestamp
KVM: Silence underflow warning in avic_get_physical_id_entry()
KVM: arm/arm64: Hold slots_lock when unregistering kvm io bus devices
KVM: arm/arm64: Fix bug when registering redist iodevs
KVM: x86: lower default for halt_poll_ns
kvm: arm/arm64: Fix use after free of stage2 page table
kvm: arm/arm64: Force reading uncached stage2 PGD
KVM: nVMX: fix EPT permissions as reported in exit qualification
KVM: VMX: Don't enable EPT A/D feature if EPT feature is disabled
KVM: x86: Fix load damaged SSEx MXCSR register
kvm: nVMX: off by one in vmx_write_pml_buffer()
KVM: arm: rename pm_fake handler to trap_raz_wi
KVM: arm: plug potential guest hardware debug leakage
kvm: arm/arm64: Fix race in resetting stage2 PGD
KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2 registers
KVM: arm/arm64: vgic-v3: Do not use Active+Pending state for a HW interrupt
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABAgAGBQJZHx/IAAoJELDendYovxMvzegIAIOyDATZsyLnbDnTunOmYqLJ
n06v50N3KwQ+pegJyz4lHdTryI10/TEUzvuT4v/V9B0sHimNRJcE7ClvRVPEaFrs
4y459kKGXRpXXAvS2r0WIY3NhwP/Num9+duVY5lInJ6caq+/JDm3S1tL2HeQ9gl1
SDuI6IMV3q12Agk6jgbvwd1XBh3wbj8Z6SOx3DAchqY/kbdy6tS4y5CR93mKpjs3
LsVyPvY2IOLWCSrPsdloM4l7lMoVmd/1tt6NfzymepIxQbIS3KWo5AwBsoM0cVfs
KGb4T3+H8uwmpyWjgibsayr31cC7LIulEqLtqZNyycpIZGR5TlZ01KEPSMKn78s=
=Boz3
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.12b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Some fixes for the new Xen 9pfs frontend and some minor cleanups"
* tag 'for-linus-4.12b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: make xen_flush_tlb_all() static
xen: cleanup pvh leftovers from pv-only sources
xen/9pfs: p9_trans_xen_init and p9_trans_xen_exit can be static
xen/9pfs: fix return value check in xen_9pfs_front_probe()
We had a small batch of fixes before -rc1, but here is a larger one. It
contains a backmerge of 4.12-rc1 since some of the downstream branches we
merge had that as base; at the same time we already had merged contents
before -rc1 and rebase wasn't the right solution.
A mix of random smaller fixes and a few things worth pointing out:
- We've started telling people to avoid cross-tree shared branches if all
they're doing is picking up one or two DT-used constants from a
shared include file, and instead to use the numeric values on first
submission. Follow-up moving over to symbolic names are sent in right
after -rc1, i.e. here. It's only a few minor patches of this type.
- Linus Walleij and others are resurrecting the 'Gemini' platform, and
wanted a cut-down platform-specific defconfig for it. So I picked that
up for them.
- Rob Herring ran 'savedefconfig' on arm64, it's a bit churny but it helps
people to prepare patches since it's a pain when defconfig and current
savedefconfig contents differs too much.
- Devicetree additions for some pinctrl drivers for Armada that were
merged this window. I'd have preferred to see those earlier but it's not
a huge deail.
The biggest change worth pointing out though since it's touching other
parts of the tree: We added prefixes to be used when cross-including
DT contents between arm64 and arm, allowing someone to #include
<arm/foo.dtsi> from arm64, and likewise. As part of that, we needed
arm/foo.dtsi to work on arm as well. The way I suggested this to Heiko
resulted in a recursive symlink.
Instead, I've now moved it out of arch/*/boot/dts/include, into a shared
location under scripts/dtc. While I was at it, I consolidated so all
architectures now behave the same way in this manner.
Rob Herring (DT maintainer) has acked it. I cc:d most other arch
maintainers but nobody seems to care much; it doesn't really affect them
since functionality is unchanged for them by default.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZH0gEAAoJEIwa5zzehBx3eCcQAJX55nWjTV/ankFWyaQiXZx1
JhcThxugqPYviYFFTpI3LZnZ0snWbZBNfkoju8ukmzIiqoO/eDlB+LVz6PVWfCIl
4egZZZF1tgxEFoQQ71WKpF1hj0pKccCugHX+5uBDID3s9vjxgQS1Gf1G5ZeFrqbd
m9brxbouGsZMscuWb59K7ayIXO6D4C2hqQqJtGrOZc2jfLs9rZBchDVSQ28sRNQy
qXIcAgH+D1QWfbAi0+cI6opnWmEdcofO5Uge8KzK1wO0HYzO5GQJw1KbM/AAJ7+Y
JtPEWhuUKl8aou6515rFPD7yjFaMtfbL0+0UeKS2TRGz+dSCoSs1kTyJ4cpNAUCT
E3hOLYKzq8rbxcGwEqfp4JjktpWSPGGhEbp4lvNV1gk9A0MLHPnidLCKSoLyCkN0
3qmmlrt4hSCpF07IvY7hWUALHIOsRPtIdbaOMzAyzcWkzu/DMmQ3lFdt7Bgi3AbB
j0Phtz0TR7X6A/1gAxZDGjHaYaEG6KR9ufJMyCNtgGUaKeMZakthbYSz8MdXIq5X
zKqL2ZyPKNq6zHZbvc3yIiYmVKubT9t+8Wc4AjXPNdWgR455V0GSlmf3XCA8rAp7
hISzE4CD4N/YIKNPukt4kcJY7TBpcOZxfquMfBxLEqke+GxJL80CGaOf8iZb3ipM
R697L88FstLhSNhEl/gu
=2EGB
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"We had a small batch of fixes before -rc1, but here is a larger one.
It contains a backmerge of 4.12-rc1 since some of the downstream
branches we merge had that as base; at the same time we already had
merged contents before -rc1 and rebase wasn't the right solution.
A mix of random smaller fixes and a few things worth pointing out:
- We've started telling people to avoid cross-tree shared branches if
all they're doing is picking up one or two DT-used constants from a
shared include file, and instead to use the numeric values on first
submission. Follow-up moving over to symbolic names are sent in
right after -rc1, i.e. here. It's only a few minor patches of this
type.
- Linus Walleij and others are resurrecting the 'Gemini' platform,
and wanted a cut-down platform-specific defconfig for it. So I
picked that up for them.
- Rob Herring ran 'savedefconfig' on arm64, it's a bit churny but it
helps people to prepare patches since it's a pain when defconfig
and current savedefconfig contents differs too much.
- Devicetree additions for some pinctrl drivers for Armada that were
merged this window. I'd have preferred to see those earlier but
it's not a huge deail.
The biggest change worth pointing out though since it's touching other
parts of the tree: We added prefixes to be used when cross-including
DT contents between arm64 and arm, allowing someone to #include
<arm/foo.dtsi> from arm64, and likewise. As part of that, we needed
arm/foo.dtsi to work on arm as well. The way I suggested this to Heiko
resulted in a recursive symlink.
Instead, I've now moved it out of arch/*/boot/dts/include, into a
shared location under scripts/dtc. While I was at it, I consolidated
so all architectures now behave the same way in this manner.
Rob Herring (DT maintainer) has acked it. I cc:d most other arch
maintainers but nobody seems to care much; it doesn't really affect
them since functionality is unchanged for them by default"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (29 commits)
arm64: dts: rockchip: fix include reference
firmware: ti_sci: fix strncat length check
ARM: remove duplicate 'const' annotations'
arm64: defconfig: enable options needed for QCom DB410c board
arm64: defconfig: sync with savedefconfig
ARM: configs: add a gemini defconfig
devicetree: Move include prefixes from arch to separate directory
ARM: dts: dra7: Reduce cpu thermal shutdown temperature
memory: omap-gpmc: Fix debug output for access width
ARM: dts: LogicPD Torpedo: Fix camera pin mux
ARM: dts: omap4: enable CEC pin for Pandaboard A4 and ES
ARM: dts: gta04: fix polarity of clocks for mcbsp4
ARM: dts: dra7: Add power hold and power controller properties to palmas
soc: imx: add PM dependency for IMX7_PM_DOMAINS
ARM: dts: imx6sx-sdb: Remove OPP override
ARM: dts: imx53-qsrb: Pulldown PMIC IRQ pin
soc: bcm: brcmstb: Correctly match 7435 SoC
tee: add ARM_SMCCC dependency
ARM: omap2+: make omap4_get_cpu1_ns_pa_addr declaration usable
ARM64: dts: mediatek: configure some fixed mmc parameters
...
- Avoid taking a mutex in the secondary CPU bring-up path when
interrupts are disabled
- Ignore perf exclude_hv when the kernel is running in Hyp mode
- Remove redundant instruction in cmpxchg
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZHva5AAoJEGvWsS0AyF7xjGQP/ibkFaPSSe1PfpM3pC3mdyv1
2o1oChoKKh4u4VR5gZs2nAXZiOoZjPByvnzrJiF9k5KThkTJechr6PS0srUHtxVt
hpReVgsnuqZ4iX68EsiMY+ayoMmBpHGBY9ErlbWPHpgttMHqq7xiMcb/+lSxHoxX
GXrsz5J2KaHv7zNQY3nx+Ear592n7WdrBvUpBYh9jyGQWlC2cxKV/1RbcCa76MGx
33xKExL/NTsTy3cnP8gjVL95RmJCm4qfDGJky/UC8tDWnS92fXt5HHbkgWNCFIgD
eL6hNyZtl4Sxmfcn1QwzBiTL2+VIsQhuqRW9gH0PiBP15TFh0nfzegrlylJ2gw6A
qmq+fos1zyYdyzA43IDfrmbYJb7MD2pwuhgg8yDkIEguUUgMhdnd7V8nmgzAT8g2
nX1FwqZxz9+wBZHfP5gEEiIhCdOkbU6zFKWXlhxVluUkPDUdUKIWt80V6jGljJXj
Np15Ld7joVcH06g1bwiUziNH8q/qpIE2YKxXKPHs9LfrEPtpUwK372rSeQFa28XE
vxqbXyhfPujXNukCyVEMWYzzR3qE2hb3nh/dU6Oom+W6ZSE3f6s26iOM0Jo34nIN
hZGws5onbIWOHhO34/KtRoyDtJDNM6I6Yjzn5QZntFf3HKBSI7s7VVntoZSSa1Gm
oG0Fuy13XSqyBuxw70nL
=eZ2l
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes/cleanups from Catalin Marinas:
- Avoid taking a mutex in the secondary CPU bring-up path when
interrupts are disabled
- Ignore perf exclude_hv when the kernel is running in Hyp mode
- Remove redundant instruction in cmpxchg
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/cpufeature: don't use mutex in bringup path
arm64: perf: Ignore exclude_hv when kernel is running in HYP
arm64: Remove redundant mov from LL/SC cmpxchg
The headline is a fix for FP/VMX register corruption when using transactional
memory, and a new selftest to go with it.
Then there's the virt_addr_valid() fix, currently HARDENDED_USERCOPY is tripping
on that causing some machines to crash.
A few other fairly minor fixes for long tail things, and a couple of fixes for
code we just merged.
Thanks to:
Breno Leitao, Gautham R. Shenoy, Michael Neuling, Naveen N. Rao. Nicholas
Piggin, Paul Mackerras.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZHrn+AAoJEFHr6jzI4aWAgAIQAKv7pthWNkQV+prW9rJqgADN
11V/2Ldjy6oZ0871GDn1bElzkaKMTNLlTYsm2+RAHS1p53+SVlLHig5FdoxqBptj
X6N5kUONgYA+jBBXriT/UC91tAJNR+xHmB8QEoFUtUpUBaZ66NVGYeGwlgnP1/Qm
rRmViKuLgXPzvJ2Is6R/SbDypyU5W6Rb7n3fAd1pPQN7DlJoF6WDgALblvz2akwg
mQaW8PtIbUQGrXqQqQJ2R3o8D+n9M4xh58NEmHjTYiEHBKBOz8G2RCOrnb1PtSyh
kDLBZqAaQF7OW089wwzB2fq6kC+z8iENk3iWFk+X64lH0KOnZxZPY2NGcFqEUjLs
3BoO07fksw9waysDFRRDIcj7ECYNET+Zt3kOZoZcWXdIJc+8aNo7CHXSC1DRC/ar
GU0zRoH9dvBFfWyvjJnuOspbgHi/R+ODDgxFwqTMmCKUSGpBMkQAxSJqPh6qnyH6
HoLEVjFrsT7qfvSi6OWuENfKMb8j3kPuBVeZammCWu2V8DIL+l+WPusOwmOXkCo7
i/2i3lNuo8k5uuUqRUNMJk8t1+kUIKt7HJHwFGROv1H7173C2HxLTohxl0ZhSlf8
Xe2epWAI4eL0/SPvB8Pi8FwAzb2+FP+DiC6a+cI8YoIz2+UjwiZBUQgEeQR6ETXc
aI5uk3lnSFd62839Cezs
=KtM/
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"The headliner is a fix for FP/VMX register corruption when using
transactional memory, and a new selftest to go with it.
Then there's the virt_addr_valid() fix, currently HARDENDED_USERCOPY
is tripping on that causing some machines to crash.
A few other fairly minor fixes for long tail things, and a couple of
fixes for code we just merged.
Thanks to: Breno Leitao, Gautham Shenoy, Michael Neuling, Naveen Rao.
Nicholas Piggin, Paul Mackerras"
* tag 'powerpc-4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix virt_addr_valid() etc. on 64-bit hash
powerpc/mm: Fix crash in page table dump with huge pages
powerpc/kprobes: Fix handling of instruction emulation on probe re-entry
powerpc/powernv: Set NAPSTATELOST after recovering paca on P9 DD1
selftests/powerpc: Test TM and VMX register state
powerpc/tm: Fix FP and VMX register corruption
powerpc/modules: If mprofile-kernel is enabled add it to vermagic
get_msr() of MSR_EFER is currently always going to succeed, but static
checker doesn't see that far.
Don't complicate stuff and just use 0 for the fallback -- it means that
the feature is not present.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Static analysis noticed that pmu->nr_arch_gp_counters can be 32
(INTEL_PMC_MAX_GENERIC) and therefore cannot be used to shift 'int'.
I didn't add BUILD_BUG_ON for it as we have a better checker.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 25462f7f52 ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Static checker noticed that base3 could be used uninitialized if the
segment was not present (useable). Random stack values probably would
not pass VMCS entry checks.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 1aa366163b ("KVM: x86 emulator: consolidate segment accessors")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation.
- "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only,
a read should be ignored) in guest can get a random number.
- "rep insb" instruction to access PIT register port 0x43 can control memcpy()
in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes,
which will disclose the unimportant kernel memory in host but no crash.
The similar test program below can reproduce the read out-of-bounds vulnerability:
void hexdump(void *mem, unsigned int len)
{
unsigned int i, j;
for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++)
{
/* print offset */
if(i % HEXDUMP_COLS == 0)
{
printf("0x%06x: ", i);
}
/* print hex data */
if(i < len)
{
printf("%02x ", 0xFF & ((char*)mem)[i]);
}
else /* end of block, just aligning for ASCII dump */
{
printf(" ");
}
/* print ASCII dump */
if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1))
{
for(j = i - (HEXDUMP_COLS - 1); j <= i; j++)
{
if(j >= len) /* end of block, not really printing */
{
putchar(' ');
}
else if(isprint(((char*)mem)[j])) /* printable char */
{
putchar(0xFF & ((char*)mem)[j]);
}
else /* other char */
{
putchar('.');
}
}
putchar('\n');
}
}
}
int main(void)
{
int i;
if (iopl(3))
{
err(1, "set iopl unsuccessfully\n");
return -1;
}
static char buf[0x40];
/* test ioport 0x40,0x41,0x42,0x43,0x44,0x45 */
memset(buf, 0xab, sizeof(buf));
asm volatile("push %rdi;");
asm volatile("mov %0, %%rdi;"::"q"(buf));
asm volatile ("mov $0x40, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x41, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x42, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x43, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x44, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("mov $0x45, %rdx;");
asm volatile ("in %dx,%al;");
asm volatile ("stosb;");
asm volatile ("pop %rdi;");
hexdump(buf, 0x40);
printf("\n");
/* ins port 0x40 */
memset(buf, 0xab, sizeof(buf));
asm volatile("push %rdi;");
asm volatile("mov %0, %%rdi;"::"q"(buf));
asm volatile ("mov $0x20, %rcx;");
asm volatile ("mov $0x40, %rdx;");
asm volatile ("rep insb;");
asm volatile ("pop %rdi;");
hexdump(buf, 0x40);
printf("\n");
/* ins port 0x43 */
memset(buf, 0xab, sizeof(buf));
asm volatile("push %rdi;");
asm volatile("mov %0, %%rdi;"::"q"(buf));
asm volatile ("mov $0x20, %rcx;");
asm volatile ("mov $0x43, %rdx;");
asm volatile ("rep insb;");
asm volatile ("pop %rdi;");
hexdump(buf, 0x40);
printf("\n");
return 0;
}
The vcpu->arch.pio_data buffer is used by both in/out instrutions emulation
w/o clear after using which results in some random datas are left over in
the buffer. Guest reads port 0x43 will be ignored since it is write only,
however, the function kernel_pio() can't distigush this ignore from successfully
reads data from device's ioport. There is no new data fill the buffer from
port 0x43, however, emulator_pio_in_emulated() will copy the stale data in
the buffer to the guest unconditionally. This patch fixes it by clearing the
buffer before in instruction emulation to avoid to grant guest the stale data
in the buffer.
In addition, string I/O is not supported for in kernel device. So there is no
iteration to read ioport %RCX times for string I/O. The function kernel_pio()
just reads one round, and then copy the io size * %RCX to the guest unconditionally,
actually it copies the one round ioport data w/ other random datas which are left
over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by
introducing the string I/O support for in kernel device in order to grant the right
ioport datas to the guest.
Before the patch:
0x000000: fe 38 93 93 ff ff ab ab .8......
0x000008: ab ab ab ab ab ab ab ab ........
0x000010: ab ab ab ab ab ab ab ab ........
0x000018: ab ab ab ab ab ab ab ab ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: f6 00 00 00 00 00 00 00 ........
0x000008: 00 00 00 00 00 00 00 00 ........
0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
0x000018: 30 30 20 33 20 20 20 20 00 3
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: f6 00 00 00 00 00 00 00 ........
0x000008: 00 00 00 00 00 00 00 00 ........
0x000010: 00 00 00 00 4d 51 30 30 ....MQ00
0x000018: 30 30 20 33 20 20 20 20 00 3
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
After the patch:
0x000000: 1e 02 f8 00 ff ff ab ab ........
0x000008: ab ab ab ab ab ab ab ab ........
0x000010: ab ab ab ab ab ab ab ab ........
0x000018: ab ab ab ab ab ab ab ab ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: d2 e2 d2 df d2 db d2 d7 ........
0x000008: d2 d3 d2 cf d2 cb d2 c7 ........
0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........
0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
0x000000: 00 00 00 00 00 00 00 00 ........
0x000008: 00 00 00 00 00 00 00 00 ........
0x000010: 00 00 00 00 00 00 00 00 ........
0x000018: 00 00 00 00 00 00 00 00 ........
0x000020: ab ab ab ab ab ab ab ab ........
0x000028: ab ab ab ab ab ab ab ab ........
0x000030: ab ab ab ab ab ab ab ab ........
0x000038: ab ab ab ab ab ab ab ab ........
Reported-by: Moguofang <moguofang@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Moguofang <moguofang@huawei.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/2809
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 2 PID: 2809 Comm: qemu-system-x86 Not tainted 4.11.0+ #13
Call Trace:
dump_stack+0x99/0xce
check_preemption_disabled+0xf5/0x100
__this_cpu_preempt_check+0x13/0x20
get_kvmclock_ns+0x6f/0x110 [kvm]
get_time_ref_counter+0x5d/0x80 [kvm]
kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
? kvm_hv_process_stimers+0x2a1/0x8a0 [kvm]
? kvm_arch_vcpu_ioctl_run+0xac9/0x1ce0 [kvm]
kvm_arch_vcpu_ioctl_run+0x5bf/0x1ce0 [kvm]
kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
? __fget+0xf3/0x210
do_vfs_ioctl+0xa4/0x700
? __fget+0x114/0x210
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x23/0xc2
RIP: 0033:0x7f9d164ed357
? __this_cpu_preempt_check+0x13/0x20
This can be reproduced by run kvm-unit-tests/hyperv_stimer.flat w/
CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT enabled.
Safe access to per-CPU data requires a couple of constraints, though: the
thread working with the data cannot be preempted and it cannot be migrated
while it manipulates per-CPU variables. If the thread is preempted, the
thread that replaces it could try to work with the same variables; migration
to another CPU could also cause confusion. However there is no preemption
disable when reads host per-CPU tsc rate to calculate the current kvmclock
timestamp.
This patch fixes it by utilizing get_cpu/put_cpu pair to guarantee both
__this_cpu_read() and rdtsc() are not preempted.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The way we handle include paths for DT has changed a bit, which
broke a file that had an unconventional way to reference a common
header file:
arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts:47:10: fatal error: include/dt-bindings/input/linux-event-codes.h: No such file or directory
This removes the leading "include/" from the path name, which fixes it.
Fixes: d5d332d3f7 ("devicetree: Move include prefixes from arch to separate directory")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
for various devices. Also included is a memory controller (GPMC) debug output
fix as without that the shown bootloader configured GPMC bus width will
be wrong and won't work for kernel timings:
- Add dra7 powerhold configuration to be able to shut down pmic correctly
- Fix polarity for gta04 mcbsp4 clocks for modem
- Fix Pandaboard CEC pin pull making it usable
- Fix LogicPD Torpedo camera pin mux
- Fix GPMC debug bus width
- Reduce cpu thermal shutdown temperature
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAlkdw8ARHHRvbnlAYXRv
bWlkZS5jb20ACgkQG9Q+yVyrpXOERBAA0y2DvjgvpYL+AefHlaoeN8OW6PL/GB29
sCZglf3Tc7pdmqJQXFbj1UqqPHCw3tnSAbfYwcOdiFr8g2ni0A9s9HjKoIZQzphw
s+llq282YdTi+pWxjQ79OvkxhTJZnKX0G8Ls31klmHT/4/vIGsiTuYodum3FN0n2
/uFGDGBljjMGyE69klJ6LD13VXcXSe0ESK5iIV/naVMbdzLFUW9kAxMz6IL3TO48
gP/wOpJmksq3l+vFUc/XOrNX3rIFoS/coegYUM5lkFNHtIiooIbvWrTnc6Ub/0oV
XXEoijIUSralcouYyMLdTco+ag/qBtd4xpHdEYGFg04HQrvI7iEF6OAW2MAgjW5z
D9X5CMqkPaMRf6X2/LaVQ6sqrcw2t75lW6sRX/y9N2fH4WtchYyhpMVhw+OH241x
LNLmV7HaNkE0AaWSjHtYUA1HFhMUWHqQAPqy5GWfxoJvzx7HofG1txiLRF9rUzBt
OZ30uWUpVi3koJBVG12Rz+UhN90H4/qK+kwDcQOOEDNd1kQcT/QNFew3fGBNqkDL
PivoXWAJnbEJQ6Xr+ablemE7hyyBHHJxDAQsOLh4VitxEPw1TYh4/widUi52/0VQ
PMUdbIBhQJp6St7TtUWW8CckdzbjsAMk0zJgcZeBXmczfTjH8PMOuxLvATUweEf+
JD+zCBBfXFU=
=My0U
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.12/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omaps for v4.12-rc cycle most consisting of few minor dts fixes
for various devices. Also included is a memory controller (GPMC) debug output
fix as without that the shown bootloader configured GPMC bus width will
be wrong and won't work for kernel timings:
- Add dra7 powerhold configuration to be able to shut down pmic correctly
- Fix polarity for gta04 mcbsp4 clocks for modem
- Fix Pandaboard CEC pin pull making it usable
- Fix LogicPD Torpedo camera pin mux
- Fix GPMC debug bus width
- Reduce cpu thermal shutdown temperature
* tag 'omap-for-v4.12/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: dra7: Reduce cpu thermal shutdown temperature
memory: omap-gpmc: Fix debug output for access width
ARM: dts: LogicPD Torpedo: Fix camera pin mux
ARM: dts: omap4: enable CEC pin for Pandaboard A4 and ES
ARM: dts: gta04: fix polarity of clocks for mcbsp4
ARM: dts: dra7: Add power hold and power controller properties to palmas
Signed-off-by: Olof Johansson <olof@lixom.net>
- A fix on GPCv2 power domain driver Kconfig which causes a build
failure when CONFIG_PM is not set.
- Pull down PMIC IRQ pin for imx53-qsrb board to prevent spurious
PMIC interrupts from happening.
- Remove board level OPP override for imx6sx-sdb to fix a boot crash
seen on Rev.C boards.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJZHav8AAoJEFBXWFqHsHzOPWMH/RP4LuJDUoFaPUJ3WsrMhPYR
qmBEzZXvjKBxImFCIiG09vcjQIiN9p5uZ9AqNIQGgoW20DlFKVZ0Rd64MJzp8QnL
J/WwEDTG1qs2jCkNWYkFEUwz6fMWzzTUydj665dkQI3SNwRAJ9dE0LgybM6Yws3C
FQdIUyWR2z8QQYqjLTD9XuDe7ewyoslvFssWRiVzpcnkiu8sWgbTuJ/K+ISCgwo4
rC6Vi7Z4WU47SgL7+d9sOMzDPDIif5mOdfW8uG3tgCeDp/8WNz9IbamCuxhggQpa
+k9jaTHAtOV9ypf1PBUCGHazU7E8hc42XoW6XT0yY3sFynw+VPd1IFVXiWAEBu0=
=gQA8
-----END PGP SIGNATURE-----
Merge tag 'imx-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
i.MX fixes for 4.12:
- A fix on GPCv2 power domain driver Kconfig which causes a build
failure when CONFIG_PM is not set.
- Pull down PMIC IRQ pin for imx53-qsrb board to prevent spurious
PMIC interrupts from happening.
- Remove board level OPP override for imx6sx-sdb to fix a boot crash
seen on Rev.C boards.
* tag 'imx-fixes-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx: add PM dependency for IMX7_PM_DOMAINS
ARM: dts: imx6sx-sdb: Remove OPP override
ARM: dts: imx53-qsrb: Pulldown PMIC IRQ pin
Signed-off-by: Olof Johansson <olof@lixom.net>
Enable Qualcomm drivers needed to boot Dragonboard 410c with HDMI. This
enables support for clocks, regulators, and USB PHY.
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
[Olof: Turned off _RPM configs per follow-up email]
Signed-off-by: Olof Johansson <olof@lixom.net>
Sync the defconfig with savedefconfig as config options change/move over
time.
Generated with the following commands:
make defconfig
make savedefconfig
cp defconfig arch/arm64/configs/defconfig
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
It makes sense to have a stripped-down defconfig for just Gemini, as
it is a pretty small platform used in NAS etc, and will use appended
device tree. It is also quick to compile and test. Hopefully this
defconfig can be a good base for distributions such as OpenWRT.
I plan to add in the config options needed for the different
variants of Gemini as we go along.
Cc: Janos Laube <janos.dev@gmail.com>
Cc: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
4.12, please pull the following:
- Baruch provides several fixes for the Raspberry Pi (BCM2835) Device
Tree source include file: uart0 pinctrl node names, pin number for
i2c0, uart0 rts/cts pins and invalid uart1 pin, missing numbers for
ethernet aliases
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJZF8+xAAoJEIfQlpxEBwcERmUP/iAf5euX/RPb3eJ8QmDhn2ap
bnU4Puh5aT16fwqK/fohpzqcxhlkqsPhIvl6gwi6UHzNbZOqhz2gwXBjP7DuDsPx
rNF7qJE0zFCy/jmhiSbMWiC62Q7UbIR+TmtP3cFzXw8edM6Tjp+IFkE8fuSmQkdd
/djp+3k16KNRsSu+iyNDJSu4jNjigKP1DYPFJi0cMceQZ2Hr6KHO7r2mHXmZ8Cf7
cxIPw38ZRiwIU6gbyPrVSLjJJ1X3O3QAyqlTbeooJn5bZ6cEc7jt2LlUUoLY2JTQ
+2HoT2DsrDX8SfkJmyBr41/awsHq2wx+++zss3qd9LHKqRyKBZgY6HVea05+z1T9
+FtWZ74zmOzHrjqwB5a3fN4+EfY4HKqSDVnrgbNu9Y59LCZPeVSLcFK1rpuLnu62
ZaeTTOmwhwzetOGi96+2LhnMihFbGoB5te+y8nZtf9vAONp6nLsDgQ30gqGC6/19
JeR+5z+z++7TpsghLJ+YlE4HjxMRV5zFYTLmHHI76DJOCb8i48wMURDizPHZYuaq
+aRiBJkmweZtu1KJea867UIDblgk2K9QosL35lomrD518VeTIgSrmBrux7fwlMC/
r8Z0B3z632gNYi+ZrR/Fr+9lmHpq7NBqEai5BAwI1DkwW3HfdeSAczZ5Tij42eGA
U2qLiZSH7U/8cb6mCp0H
=6jUO
-----END PGP SIGNATURE-----
Merge tag 'arm-soc/for-4.12/devicetree-fixes' of http://github.com/Broadcom/stblinux into fixes
This pull request contains Broadcom ARM-based SoC Device Tree fixes for
4.12, please pull the following:
- Baruch provides several fixes for the Raspberry Pi (BCM2835) Device
Tree source include file: uart0 pinctrl node names, pin number for
i2c0, uart0 rts/cts pins and invalid uart1 pin, missing numbers for
ethernet aliases
* tag 'arm-soc/for-4.12/devicetree-fixes' of http://github.com/Broadcom/stblinux:
ARM: dts: bcm2835: add index to the ethernet alias
ARM: dts: bcm2835: fix uart0/uart1 pins
ARM: dts: bcm2835: fix i2c0 pins
ARM: dts: bcm2835: fix uart0 pinctrl node names
Signed-off-by: Olof Johansson <olof@lixom.net>
We use a directory under arch/$ARCH/boot/dts as an include path
that has links outside of the subtree to find dt-bindings from under
include/dt-bindings. That's been working well, but new DT architectures
haven't been adding them by default.
Recently there's been a desire to share some of the DT material between
arm and arm64, which originally caused developers to create symlinks or
relative includes between the subtrees. This isn't ideal -- it breaks
if the DT files aren't stored in the exact same hierarchy as the kernel
tree, and generally it's just icky.
As a somewhat cleaner solution we decided to add a $ARCH/ prefix link
once, and allow DTS files to reference dtsi (and dts) files in other
architectures that way.
Original approach was to create these links under each architecture,
but it lead to the problem of recursive symlinks.
As a remedy, move the include link directories out of the architecture
trees into a common location. At the same time, they can now share one
directory and one dt-bindings/ link as well.
Fixes: 4027494ae6 ('ARM: dts: add arm/arm64 include symlinks')
Reported-by: Russell King <linux@armlinux.org.uk>
Reported-by: Omar Sandoval <osandov@osandov.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: linux-arch <linux-arch@vger.kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZF2pwAAoJEHm+PkMAQRiG9aAIAJJyV6Ux9kaX+glqO3KIs0wm
0K/yqMOv1JTfJ1UUgY4iZbk5XOPPmXv1bdKJFECZfuAHdymJUF/RVNNvDlZbaLdd
K8vDEi92eRwcf07a5b/Q2F8yNfADKKmRAA/oAbuQLBhJ0dPHig70PIvi9gq9kqiE
Ft1MinbsZLavYatLm7oVDr/nsYebEDMGwTy0EX5bF2YjydfAlCvVWnI5ld5wisiV
0fQF4W7MMjjcpAzG8uq3atEB8iQcWS2Ykz2chZRbYzHcdV2WJW751Vge9xc05Hzi
rxlqn6peZFiFyM0qdPLhY0ktGzSTZcCFeb3aZicvm5aOamy2KJjOSZrEwjU8kts=
=VHpx
-----END PGP SIGNATURE-----
Merge tag 'v4.12-rc1' into fixes
We've received a few fixes branches with -rc1 as base, but our contents was
still at pre-rc1. Merge it in expliticly to make 'git merge --log' clear on
hat was actually merged.
Signed-off-by: Olof Johansson <olof@lixom.net>
xen_flush_tlb_all() is used in arch/x86/xen/mmu.c only. Make it static.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
There are some leftovers testing for pvh guest mode in pv-only source
files. Remove them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
virt_addr_valid() is supposed to tell you if it's OK to call virt_to_page() on
an address. What this means in practice is that it should only return true for
addresses in the linear mapping which are backed by a valid PFN.
We are failing to properly check that the address is in the linear mapping,
because virt_to_pfn() will return a valid looking PFN for more or less any
address. That bug is actually caused by __pa(), used in virt_to_pfn().
eg: __pa(0xc000000000010000) = 0x10000 # Good
__pa(0xd000000000010000) = 0x10000 # Bad!
__pa(0x0000000000010000) = 0x10000 # Bad!
This started happening after commit bdbc29c19b ("powerpc: Work around gcc
miscompilation of __pa() on 64-bit") (Aug 2013), where we changed the definition
of __pa() to work around a GCC bug. Prior to that we subtracted PAGE_OFFSET from
the value passed to __pa(), meaning __pa() of a 0xd or 0x0 address would give
you something bogus back.
Until we can verify if that GCC bug is no longer an issue, or come up with
another solution, this commit does the minimal fix to make virt_addr_valid()
work, by explicitly checking that the address is in the linear mapping region.
Fixes: bdbc29c19b ("powerpc: Work around gcc miscompilation of __pa() on 64-bit")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Breno Leitao <breno.leitao@gmail.com>
Smatch complains that we check cap the upper bound of "index" but don't
check for negatives. It's a false positive because "index" is never
negative. But it's also simple enough to make it unsigned which makes
the code easier to audit.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Includes:
- A fix for a build failure introduced in -rc1 when tracepoints are
enabled on 32-bit ARM.
- Disabling use of stack pointer protection in the hyp code which can
cause panics.
- A handful of VGIC fixes.
- A fix to the init of the redistributors on GICv3 systems that
prevented boot with kvmtool on GICv3 systems introduced in -rc1.
- A number of race conditions fixed in our MMU handling code.
- A fix for the guest being able to program the debug extensions for
the host on the 32-bit side.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJZHWzeAAoJEEtpOizt6ddyFO0H/jpgdDvmu8mL+sk4xdjCxsqE
sl0uH8vnlziTmIFtRlXKtB1ecL+waj22YEoVLIA3lXz92uZXF3RDoFqVW3NGnpbO
pFyUs8ZHfhyo3nHI7DZqT+/SeButwwX+cw02tRpaEOPhlun7BlEZEco26r2y/2xR
WZdTDYYkAjTtNtY1dJ7xzNrhSJZXpf54rvQshYSbqn+gVdGVHaosQygMPohU8tOF
V0AX3gQ3pnR3hEgE3Cz+F3TGhORg9buOADS28CdvDCx6ekJdy55Snf6AHFoZ9cdT
n4YZpon2HO1ZIxpMLvg1Ud+M3bQxeYLBWj2av3PAHsMFoR23x4xTH6YVA53sfFY=
=SJeq
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/ARM Fixes for v4.12-rc2.
Includes:
- A fix for a build failure introduced in -rc1 when tracepoints are
enabled on 32-bit ARM.
- Disabling use of stack pointer protection in the hyp code which can
cause panics.
- A handful of VGIC fixes.
- A fix to the init of the redistributors on GICv3 systems that
prevented boot with kvmtool on GICv3 systems introduced in -rc1.
- A number of race conditions fixed in our MMU handling code.
- A fix for the guest being able to program the debug extensions for
the host on the 32-bit side.
The ftrace function_graph time measurements of a given function is not
accurate according to those recorded by ftrace using the function
filters. This change pulls the x86_64 fix from 'commit 722b3c7469
("ftrace/graph: Trace function entry before updating index")' into the
sparc specific prepare_ftrace_return which stops ftrace from
counting interrupted tasks in the time measurement.
Example measurements for select_task_rq_fair running "hackbench 100
process 1000":
| tracing/trace_stat/function0 | function_graph
Before patch | 2.802 us | 4.255 us
After patch | 2.749 us | 3.094 us
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greetings,
GCC 7 introduced the -Wstringop-overflow flag to detect buffer overflows
in calls to string handling functions [1][2]. Due to the way
``empty_zero_page'' is declared in arch/sparc/include/setup.h, this
causes a warning to trigger at compile time in the function mem_init(),
which is subsequently converted to an error. The ensuing patch fixes
this issue and aligns the declaration of empty_zero_page to that of
other architectures. Thank you.
Cheers,
Orlando.
[1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02308.html
[2] https://gcc.gnu.org/gcc-7/changes.html
Signed-off-by: Orlando Arias <oarias@knights.ucf.edu>
--------------------------------------------------------------------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
An incorrect huge page alignment check caused
mmap failure for 64K pages when MAP_FIXED is used
with address not aligned to HPAGE_SIZE.
Orabug: 25885991
Fixes: dcd1912d21 ("sparc64: Add 64K page size support")
Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, cpus_set_cap() calls static_branch_enable_cpuslocked(), which
must take the jump_label mutex.
We call cpus_set_cap() in the secondary bringup path, from the idle
thread where interrupts are disabled. Taking a mutex in this path "is a
NONO" regardless of whether it's contended, and something we must avoid.
We didn't spot this until recently, as ___might_sleep() won't warn for
this case until all CPUs have been brought up.
This patch avoids taking the mutex in the secondary bringup path. The
poking of static keys is deferred until enable_cpu_capabilities(), which
runs in a suitable context on the boot CPU. To account for the static
keys being set later, cpus_have_const_cap() is updated to use another
static key to check whether the const cap keys have been initialised,
falling back to the caps bitmap until this is the case.
This means that users of cpus_have_const_cap() gain should only gain a
single additional NOP in the fast path once the const caps are
initialised, but should always see the current cap value.
The hyp code should never dereference the caps array, since the caps are
initialized before we run the module initcall to initialise hyp. A check
is added to the hyp init code to document this requirement.
This change will sidestep a number of issues when the upcoming hotplug
locking rework is merged.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyniger <marc.zyngier@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On dra7, as per TRM, the HW shutdown (TSHUT) temperature is hardcoded
to 123C and cannot be modified by SW. This means when the temperature
reaches 123C HW asserts TSHUT output which signals a warm reset.
This reset is held until the temperature goes below the TSHUT low (105C).
While in SW, the thermal driver continuously monitors current temperature
and takes decisions based on whether it reached an alert or a critical point.
The intention of setting a SW critical point is to prevent force reset by HW
and instead do an orderly_poweroff(). But if the SW critical temperature is
greater than or equal to that of HW then it defeats the purpose. To address
this and let SW take action before HW does keep the SW critical temperature
less than HW TSHUT value.
The value for SW critical temperature was chosen as 120C just to ensure
we give SW sometime before HW catches up.
Document reference
SPRUI30C – DRA75x, DRA74x Technical Reference Manual - November 2016
SPRUHZ6H - AM572x Technical Reference Manual - November 2016
Tested on:
DRA75x PG 2.0 Rev H EVM
Signed-off-by: Ravikumar Kattekola <rk@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The page table dump code doesn't know about huge pages, so currently
it crashes (or walks random memory, usually leading to a crash), if it
finds a huge page. On Book3S we only see huge pages in the Linux page
tables when we're using the P9 Radix MMU.
Teaching the code to properly handle huge pages is a bit more involved,
so for now just prevent the crash.
Cc: stable@vger.kernel.org # v4.10+
Fixes: 8eb07b1870 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>