Pull x86 pti fixes from Thomas Gleixner:
"A set of updates for the x86/pti related code:
- Preserve r8-r11 in int $0x80. r8-r11 need to be preserved, but the
int$80 entry code removed that quite some time ago. Make it correct
again.
- A set of fixes for the Global Bit work which went into 4.17 and
caused a bunch of interesting regressions:
- Triggering a BUG in the page attribute code due to a missing
check for early boot stage
- Warnings in the page attribute code about holes in the kernel
text mapping which are caused by the freeing of the init code.
Handle such holes gracefully.
- Reduce the amount of kernel memory which is set global to the
actual text and do not incidentally overlap with data.
- Disable the global bit when RANDSTRUCT is enabled as it
partially defeats the hardening.
- Make the page protection setup correct for vma->page_prot
population again. The adjustment of the protections fell through
the crack during the Global bit rework and triggers warnings on
machines which do not support certain features, e.g. NX"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/64/compat: Preserve r8-r11 in int $0x80
x86/pti: Filter at vma->vm_page_prot population
x86/pti: Disallow global kernel text with RANDSTRUCT
x86/pti: Reduce amount of kernel text allowed to be Global
x86/pti: Fix boot warning from Global-bit setting
x86/pti: Fix boot problems from Global-bit setting
Pull perf fixes from Thomas Gleixner:
"The perf update contains the following bits:
x86:
- Prevent setting freeze_on_smi on PerfMon V1 CPUs to avoid #GP
perf stat:
- Keep the '/' event modifier separator in fallback, for example when
fallbacking from 'cpu/cpu-cycles/' to user level only, where it
should become 'cpu/cpu-cycles/u' and not 'cpu/cpu-cycles/:u' (Jiri
Olsa)
- Fix PMU events parsing rule, improving error reporting for invalid
events (Jiri Olsa)
- Disable write_backward and other event attributes for !group events
in a group, fixing, for instance this group: '{cycles,msr/aperf/}:S'
that has leader sampling (:S) and where just the 'cycles', the
leader event, should have the write_backward attribute set, in this
case it all fails because the PMU where 'msr/aperf/' lives doesn't
accepts write_backward style sampling (Jiri Olsa)
- Only fall back group read for leader (Kan Liang)
- Fix core PMU alias list for x86 platform (Kan Liang)
- Print out hint for mixed PMU group error (Kan Liang)
- Fix duplicate PMU name for interval print (Kan Liang)
Core:
- Set main kernel end address properly when reading kernel and module
maps (Namhyung Kim)
perf mem:
- Fix incorrect entries and add missing man options (Sangwon Hong)
s/390:
- Remove s390 specific strcmp_cpuid_cmp function (Thomas Richter)
- Adapt 'perf test' case record+probe_libc_inet_pton.sh for s390
- Fix s390 undefined record__auxtrace_init() return value in 'perf
record' (Thomas Richter)"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1
perf stat: Fix duplicate PMU name for interval print
perf evsel: Only fall back group read for leader
perf stat: Print out hint for mixed PMU group error
perf pmu: Fix core PMU alias list for X86 platform
perf record: Fix s390 undefined record__auxtrace_init() return value
perf mem: Document incorrect and missing options
perf evsel: Disable write_backward for leader sampling group events
perf pmu: Fix pmu events parsing rule
perf stat: Keep the / modifier separator in fallback
perf test: Adapt test case record+probe_libc_inet_pton.sh for s390
perf list: Remove s390 specific strcmp_cpuid_cmp function
perf machine: Set main kernel end address properly
A bunch of fixes, mostly for existing code and going to stable.
Our memory hot-unplug path wasn't flushing the cache before removing memory.
That is a problem now that we are doing memory hotplug on bare metal.
Three fixes for the NPU code that supports devices connected via NVLink (ie.
GPUs). The main one tweaks the TLB flush algorithm to avoid soft lockups for
large flushes.
A fix for our memory error handling where we would loop infinitely, returning
back to the bad access and hard lockup the CPU.
Fixes for the OPAL RTC driver, which wasn't handling some error cases correctly.
A fix for a hardlockup in the powernv cpufreq driver.
And finally two fixes to our smp_send_stop(), required due to a recent change to
use it on shutdown.
Thanks to:
Alistair Popple, Balbir Singh, Laurentiu Tudor, Mahesh Salgaonkar, Mark
Hairgrove, Nicholas Piggin, Rashmica Gupta, Shilpasri G Bhat.
-----BEGIN PGP SIGNATURE-----
iQIwBAABCAAaBQJa5FRaExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYA3
LQ//es8gvVVYxXOP5m+jl+LP//nQ8Z9l4ezW/0QmtAwuzAnt31F3eYcBwtIa5EaZ
Fm7iQ5eu+o4JJSj7y/a1gXZOgZaG1uprc6psUdI+FZ6rQ3AAF9BlD7J5ZvkJ/Nuz
Wo37+oxr8T8dpGYurS2nrOyP1654ZNvtkHzr1rovhNZ/Yx6GuDppyou1cBrcHgoQ
f/SILBDpwPQ6sEzMOPptN3SNajq2716kgoTT9yU2lEHGReeMPc1RL1gVw91O7jdA
RJGZl/GTPDDuT2hg0yms4eWhmMDbfQU6kRbPwBtYM5BsCvvBGuISL3RKSceNSo/C
LO3IqnirNff0zzx5dSuy+cmzoPxMbDhWV91to29HJH5cyvWCqH8V5uJsKeHnDbmr
YscSvgi6iEbiMtuckYL8Bqe/jcE/4RCRixH+j7mkJc+XUrvjligUFG9VVq8tERXF
lA/M0Zh+AI0doFjiPbkWHlbcfPu0jhwnZ7aivpf5FKdcfF6aeBr5tX+j0bRqAXEZ
FVUd2gst7s73q4B8b8QicfMpJkYfWia9PnrifrHe10EYi9kL2z5GjDOz8s6Suzed
KD+XGuLWb9zm2Fuga/Guzx2YM0DWTEk/or5qbBRh+44WTprEZxDTotVl5tTYfgsU
ErEnGqlBevCrzknbe7ZaWKlkzSNXxoF9OpETf8kVOocEuWs=
=JJLB
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A bunch of fixes, mostly for existing code and going to stable.
Our memory hot-unplug path wasn't flushing the cache before removing
memory. That is a problem now that we are doing memory hotplug on bare
metal.
Three fixes for the NPU code that supports devices connected via
NVLink (ie. GPUs). The main one tweaks the TLB flush algorithm to
avoid soft lockups for large flushes.
A fix for our memory error handling where we would loop infinitely,
returning back to the bad access and hard lockup the CPU.
Fixes for the OPAL RTC driver, which wasn't handling some error cases
correctly.
A fix for a hardlockup in the powernv cpufreq driver.
And finally two fixes to our smp_send_stop(), required due to a recent
change to use it on shutdown.
Thanks to: Alistair Popple, Balbir Singh, Laurentiu Tudor, Mahesh
Salgaonkar, Mark Hairgrove, Nicholas Piggin, Rashmica Gupta, Shilpasri
G Bhat"
* tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/kvm/booke: Fix altivec related build break
powerpc: Fix deadlock with multiple calls to smp_send_stop
cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt
powerpc: Fix smp_send_stop NMI IPI handling
rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops
powerpc/mce: Fix a bug where mce loops on memory UE.
powerpc/powernv/npu: Do a PID GPU TLB flush when invalidating a large address range
powerpc/powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters
powerpc/powernv/npu: Add lock to prevent race in concurrent context init/destroy
powerpc/powernv/memtrace: Let the arch hotunplug code flush cache
powerpc/mm: Flush cache on memory hot(un)plug
ARM:
- PSCI selection API, a leftover from 4.16 (for stable)
- Kick vcpu on active interrupt affinity change
- Plug a VMID allocation race on oversubscribed systems
- Silence debug messages
- Update Christoffer's email address (linaro -> arm)
x86:
- Expose userspace-relevant bits of a newly added feature
- Fix TLB flushing on VMX with VPID, but without EPT
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJa44lQAAoJEED/6hsPKofo1dIH/3n9AZSWvavgL2V3j6agT8Yy
hxF4nHCFEJd5aqDNwbG9QEzivKw88r3o3mdB2XAQESB2MlCYR1jkTONm7yvVJTs/
/P9gj+DEQbCj2AgT//u3BGsAsZDKFhB9JwfmV2Mp4zDIqWFa6oCOGeq/iPVAGDcN
vUpuYeIicuH9SRoxH7de3z+BEXW0O+gCABXQtvA93FKTMz35yFTgmbDVCnvaV0zL
3B+3/4/jdbTRICW8EX6Li43+gEBUMtnVNkdqxLPTuCtDG8iuPUGfgF02gH99/9gj
hliV3Q4VUZKkSABW5AqKPe4+9rbsHCh9eL0LpHFGI9y+6LeUIOXAX4CtohR8gWE=
=W9Vz
-----END PGP SIGNATURE-----
rMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- PSCI selection API, a leftover from 4.16 (for stable)
- Kick vcpu on active interrupt affinity change
- Plug a VMID allocation race on oversubscribed systems
- Silence debug messages
- Update Christoffer's email address (linaro -> arm)
x86:
- Expose userspace-relevant bits of a newly added feature
- Fix TLB flushing on VMX with VPID, but without EPT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86/headers/UAPI: Move DISABLE_EXITS KVM capability bits to the UAPI
kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in use
arm/arm64: KVM: Add PSCI version selection API
KVM: arm/arm64: vgic: Kick new VCPU on interrupt migration
arm64: KVM: Demote SVE and LORegion warnings to debug only
MAINTAINERS: Update e-mail address for Christoffer Dall
KVM: arm/arm64: Close VMID generation race
- Close some potential spectre-v1 vulnerabilities found by smatch
- Add missing list sentinel for CPUs that don't require KPTI
- Removal of unused 'addr' parameter for I/D cache coherency
- Removal of redundant set_fs(KERNEL_DS) calls in ptrace
- Fix single-stepping state machine handling in response to kernel traps
- Clang support for 128-bit integers
- Avoid instrumenting our out-of-line atomics in preparation for enabling
LSE atomics by default in 4.18
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJa4w6JAAoJELescNyEwWM0P2IIAMLITiYvB+LEwWH6VZ5zl+D0
F1qoQPon6M68fSc86rNNwoOrLzisHPTMMyR3re5+rHe67EwHCMtupkNk3s/+/vi3
PVq3W2Rjw9GTFL/7sDNmaHvJLQ3lG1HAh4uO2WneLbLV6wkbw7/JlmCcwlS48zB0
zxY5fKnZNPCAfAT34TYZGMHINy5rOoo7+H3+/ZB/f4jc3FIatfnsUb3+Mr5B/lZ9
HoOddh9PEt+CY2v5Yr2M6FJuu/oaZdX+KaAUlynd44jyF+XgB5BxXTEHoD4bEO9l
q8CzjqzUqqBn8qSF36r/gdffH4eAKkrFgMCxjdEbPX1cOj67fTquNALBmAhAA7M=
=CIk+
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Nothing too bad, but the spectre updates to smatch identified a few
places that may need sanitising so we've got those covered.
Details:
- Close some potential spectre-v1 vulnerabilities found by smatch
- Add missing list sentinel for CPUs that don't require KPTI
- Removal of unused 'addr' parameter for I/D cache coherency
- Removal of redundant set_fs(KERNEL_DS) calls in ptrace
- Fix single-stepping state machine handling in response to kernel
traps
- Clang support for 128-bit integers
- Avoid instrumenting our out-of-line atomics in preparation for
enabling LSE atomics by default in 4.18"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: avoid instrumenting atomic_ll_sc.o
KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_mmio_read_apr()
KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_get_irq()
arm64: fix possible spectre-v1 in ptrace_hbp_get_event()
arm64: support __int128 with clang
arm64: only advance singlestep for user instruction traps
arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrp
arm64: ptrace: remove addr_limit manipulation
arm64: mm: drop addr parameter from sync icache and dcache
arm64: add sentinel to kpti_safe_list
Move DISABLE_EXITS KVM capability bits to the UAPI just like the rest of
capabilities.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This round of fixes has two larger changes that came in last week:
- A set of a couple of patches all intended to finally turn on
USB support on various Amlogic SoC based boards. The respective
driver were not finalized until very late before the merge window
and the DT portion is the last bit now.
- A defconfig update for gemini that had repeatedly missed the
cut but that is required to actually boot any real machines
with the default build.
The rest are the usual small changes:
- A fix for a nasty build regression on the OMAP memory drivers
- A fix for a boot problem on Intel/Altera SocFPGA
- A MAINTAINER file update
- A couple of fixes for issues found by automated testing
(kernelci, coverity, sparse, ...)
- A few incorrect DT entries are updated to match the hardware
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJa4uqUAAoJEGCrR//JCVInP3IP/AoWoaUPZfnQQai9xJZnphAv
n0z24NJD7HikPN2zmZjZkjfF15aa9RCyYGcJFwVPAWl9uky/8NIR/3mu7s4fbuOR
aiVo2wjQDFA0UPdHw+4W+hDnMtlNvpxsycp13oJ3JSoZhgM9aqOki2xanYVB/l8I
Yd5dySR52DMs8rYJZ0HwQQHqnld6zhjxuKQzHDhr292rka+6y2WTzA1bcrpDcqQZ
8VRA2cIsaY703Gb/UvR3i+7j3fmlDjAVNDwECW06zohsXCCBMBwdlbnM02SLoCFy
oSRM7v6ypdh99JSASaMvWDog5feaTlTmJos0BHT+vkH5Rs0eGI7KLv5hrOcnbGCv
1OsI51B0jnbu680YyNo6XnJOGfPo3RjsoYrUTXRDxz6dnu6sp1Mj5Re/HCdmnEFI
l5LGjzlyYah7l+jGErItW4Tf/mSrboJpdrpS3f8ZxveFAyQMqIMt0I83OpPogtjN
7EWtEzw+FtCiCH7RHMP4tH5HLeLvJXSAkD2eRj622+r8L0Q9xWzFOoVhufNYYB80
Q9Fb6zJ/GQG9azDN84k19lPk/I0DgQMcjolTtBUVKre96AP3SUpR+YuAsUztpig8
CHZok8NolXzRqFSsNQiwSr0GOrKETNbgshepolHpuKZ4PTVTJcqRxvxK6sFmKmx/
BfKYsx/0iQYDSpnRF74g
=Zhll
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"This round of fixes has two larger changes that came in last week:
- a couple of patches all intended to finally turn on USB support on
various Amlogic SoC based boards. The respective driver were not
finalized until very late before the merge window and the DT
portion is the last bit now.
- a defconfig update for gemini that had repeatedly missed the cut
but that is required to actually boot any real machines with the
default build.
The rest are the usual small changes:
- a fix for a nasty build regression on the OMAP memory drivers
- a fix for a boot problem on Intel/Altera SocFPGA
- a MAINTAINER file update
- a couple of fixes for issues found by automated testing (kernelci,
coverity, sparse, ...)
- a few incorrect DT entries are updated to match the hardware"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: defconfig: Update Gemini defconfig
ARM: s3c24xx: jive: Fix some GPIO names
HISI LPC: Add Kconfig MFD_CORE dependency
ARM: dts: Fix NAS4220B pin config
MAINTAINERS: Remove myself as maintainer
arm64: dts: correct SATA addresses for Stingray
ARM64: dts: meson-gxm-khadas-vim2: enable the USB controller
ARM64: dts: meson-gxl-nexbox-a95x: enable the USB controller
ARM64: dts: meson-gxl-s905x-libretech-cc: enable the USB controller
ARM64: dts: meson-gx-p23x-q20x: enable the USB controller
ARM64: dts: meson-gxl-s905x-p212: enable the USB controller
ARM64: dts: meson-gxm: add GXM specific USB host configuration
ARM64: dts: meson-gxl: add USB host support
ARM: OMAP2+: Fix build when using split object directories
soc: bcm2835: Make !RASPBERRYPI_FIRMWARE dummies return failure
soc: bcm: raspberrypi-power: Fix use of __packed
ARM: dts: Fix cm2 and prm sizes for omap4
ARM: socfpga_defconfig: Remove QSPI Sector 4K size force
firmware: arm_scmi: remove redundant null check on array
arm64: dts: juno: drop unnecessary address-cells and size-cells properties
Currently, KVM flushes the TLB after a change to the APIC access page
address or the APIC mode when EPT mode is enabled. However, even in
shadow paging mode, a TLB flush is needed if VPIDs are being used, as
specified in the Intel SDM Section 29.4.5.
So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will
flush if either EPT or VPIDs are in use.
Signed-off-by: Junaid Shahid <junaids@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
32-bit user code that uses int $80 doesn't care about r8-r11. There is,
however, some 64-bit user code that intentionally uses int $0x80 to invoke
32-bit system calls. From what I've seen, basically all such code assumes
that r8-r15 are all preserved, but the kernel clobbers r8-r11. Since I
doubt that there's any code that depends on int $0x80 zeroing r8-r11,
change the kernel to preserve them.
I suspect that very little user code is broken by the old clobber, since
r8-r11 are only rarely allocated by gcc, and they're clobbered by function
calls, so they only way we'd see a problem is if the same function that
invokes int $0x80 also spills something important to one of these
registers.
The current behavior seems to date back to the historical commit
"[PATCH] x86-64 merge for 2.6.4". Before that, all regs were
preserved. I can't find any explanation of why this change was made.
Update the test_syscall_vdso_32 testcase as well to verify the new
behavior, and it strengthens the test to make sure that the kernel doesn't
accidentally permute r8..r15.
Suggested-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org
Our out-of-line atomics are built with a special calling convention,
preventing pointless stack spilling, and allowing us to patch call sites
with ARMv8.1 atomic instructions.
Instrumentation inserted by the compiler may result in calls to
functions not following this special calling convention, resulting in
registers being unexpectedly clobbered, and various problems resulting
from this.
For example, if a kernel is built with KCOV and ARM64_LSE_ATOMICS, the
compiler inserts calls to __sanitizer_cov_trace_pc in the prologues of
the atomic functions. This has been observed to result in spurious
cmpxchg failures, leading to a hang early on in the boot process.
This patch avoids such issues by preventing instrumentation of our
out-of-line atomics.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
for 4.17, please pull the following:
- Srinath fixes the register base address of all SATA controllers on
Stingray
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJa4nBdAAoJEIfQlpxEBwcEV2gP/3F04/7WK3UMemocmYlEWTO9
Q+umkRw1pn0d34HxOH4sIOe8CZr5g8Hh/YZvYI0Ewhhav9r7voIorRUAzVVNcnAr
DdcOBpiWLiv/0RW5DG6zr8EYmOzpopFaP0Jp08lemb0icj960rElftn4odrwKSDb
bPKdxg8+fzwxLF4eO+jcyEV3sYcFabPgvLfxOCUe0nnoBs/4fgdLtZdHfyJGhpLd
UTEDJE8lKMz9eaZ81O/rluBRhOFBkooA9bRGl1JLTlocsaWkpF72flihdpKZQ3iK
Sr24E3/nAZRZwz5T10lNqwvNBoFZr+ifSropPCAFdcXGT5EaRX/0gR+MSzObo7CZ
0SyG6255QQMWK/Mz6ViZ+092sd+xszBV3GpM0jVVZjU5g+NO4t1xdcaFFQl5fd5Y
im/vMO5qWOolBW813MjVxbNQoDFlgTRtgyyl64Hp1NM432ayoldrEs51cfqs5DMX
koWjoiobhboIkASCNg/Q1kClQOKSM0c+m5fC2z5ZN7WYjTiBxPU8vj3l8xsYRyCD
rTj06sjMzWwjOEpi0wCP4D+pmf2u8rBVOQPygh/kn12YTBoP6s6yNWBGxMIzWSP1
iVkVoUVOvCvFcdhUNRI/Hk4yumtLEcapiJ3rhC7yZL2V9LJOKeMorW3fGHwyXNhS
/DOGqNKZZPTXfcmri6X9
=oT+3
-----END PGP SIGNATURE-----
Merge tag 'arm-soc/for-4.17/devicetree-arm64-fixes' of https://github.com/Broadcom/stblinux into fixes
Pull "Broadcom devicetree-arm64 fixes for 4.17" from Florian Fainelli:
This pull request contains Broadcom ARM64-based SoCs Device Tree fixes
for 4.17, please pull the following:
- Srinath fixes the register base address of all SATA controllers on
Stingray
* tag 'arm-soc/for-4.17/devicetree-arm64-fixes' of https://github.com/Broadcom/stblinux:
arm64: dts: correct SATA addresses for Stingray
smp_send_stop can lock up the IPI path for any subsequent calls,
because the receiving CPUs spin in their handler function. This
started becoming a problem with the addition of an smp_send_stop
call in the reboot path, because panics can reboot after doing
their own smp_send_stop.
The NMI IPI variant was fixed with ac61c11566 ("powerpc: Fix
smp_send_stop NMI IPI handling"), which leaves the smp_call_function
variant.
This is fixed by having smp_send_stop only ever do the
smp_call_function once. This is a bit less robust than the NMI IPI
fix, because any other call to smp_call_function after smp_send_stop
could deadlock, but that has always been the case, and it was not
been a problem before.
Fixes: f2748bdfe1 ("powerpc/powernv: Always stop secondaries before reboot/shutdown")
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- Add workqueue forward declaration (for new work, but a nice clean up)
- seftest fixes for the new histogram code
- Print output fix for hwlat tracer
- Fix missing system call events - due to change in x86 syscall naming
- Fix kprobe address being used by perf being hashed
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCWuIMShQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qkrdAQDRrgIGcm4pRGrvPiGhp4FeQKUx3woM
LY10qMYo3St7zwEAn5oor/e/7KQaQSdKQ7QkL690QU2bTO6FXz4VwE1OcgM=
=OHJk
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Add workqueue forward declaration (for new work, but a nice clean up)
- seftest fixes for the new histogram code
- Print output fix for hwlat tracer
- Fix missing system call events - due to change in x86 syscall naming
- Fix kprobe address being used by perf being hashed
* tag 'trace-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix missing tab for hwlat_detector print format
selftests: ftrace: Add a testcase for multiple actions on trigger
selftests: ftrace: Fix trigger extended error testcase
kprobes: Fix random address output of blacklist file
tracing: Fix kernel crash while using empty filter with perf
tracing/x86: Update syscall trace events to handle new prefixed syscall func names
tracing: Add missing forward declaration
Pull s390 fixes from Martin Schwidefsky:
"A couple of bug fixes:
- correct some CPU-MF counter names for z13 and z14
- correct locking in the vfio-ccw fsm_io_helper function
- provide arch_uretprobe_is_alive to avoid sigsegv with uretprobes
- fix a corner case with CPU-MF sampling in regard to execve
- fix expoline code revert for loadable modules
- update chpid descriptor for resource accessibility events
- fix dasd I/O errors due to outdated device alias infomation"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: correct module section names for expoline code revert
vfio: ccw: process ssch with interrupts disabled
s390: update sampling tag after task pid change
s390/cpum_cf: rename IBM z13/z14 counter names
s390/dasd: fix IO error for newly defined devices
s390/uprobes: implement arch_uretprobe_is_alive()
s390/cio: update chpid descriptor after resource accessibility event
It's possible for userspace to control idx. Sanitize idx when using it
as an array index.
Found by smatch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This updates the Gemini defconfig with a config that will bring
up most of the recently merged and updated devices to some
functional level:
- We enable high resolution timers (the right thing to do)
- Enable CMA for the framebuffer, and the new TVE200
framebuffer driver and the Ilitek ILI9322 driver for
graphics on the D-Link DIR-685. HIGHMEM support comes in
as part of this.
- Enable networking and the new Cortina Gemini ethernet
driver.
- Enable MDIO over GPIO and the Realtek PHY devices used on
several of these systems.
- Enable I2C over GPIO and SPI over GPIO which is used on
several of these devices.
- Enable the Thermal framework, GPIO fan control and LM75 sensor
adding cooling on the D-Link DNS-313 with no userspace
involved even if only the kernel is working, rock solid
thermal for this platform.
- Enable JEDEC flash probing to support the Eon flash chip in
D-Link DNS-313.
- Enable LED disk triggers for the NAS type devices.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
One of the bitbanged SPI hosts had wrongly named GPIO lines due to
sloppiness by yours truly.
Cc: arm@kernel.org
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fix a build regression with split object directories reported by Russell
and fix range sizes for omap4 cm2 and prm modules.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAlraJXYRHHRvbnlAYXRv
bWlkZS5jb20ACgkQG9Q+yVyrpXNV5g//Y7bnLVOPGTu73EiB4erJr6OHlZjtzBE/
O/QQ0UwHZvmugzztPAfEvJg+s2O9IT6nloxupJHtmGpE43b7Bz47z7PAqSaI10vT
9CJ9xwmRyobkAPnYc9deQpQwmsg4pOYFjtsFTzWB/88AgadhqjRDzIjwGIM1SDvN
EKxcS+LA33erebbpgiLAIf+4IGvu+meENEHxBYIA/5KLdcYUTw0dVXSkpR301iV6
R4wW5a1nrqac8HORu+CBmehs0VI3YMJw9tMcIrWDm//ZsPVoXGP61kM6lZxlCB0S
FbOMVGO7GmcdrdhY0BaAKa7/KqSXEVBjPtZjZdOlnCDq1YNoUvrpIGn+k5x2jt1d
NI03+FaCVqAVGWQ11UywnM55aAmLDYMkY3kUG6HySJL8zKw8m0xGHVFN8JgLI1JU
ag3JlCbd7WNkAffLgUO+fobta6P0ASaxBXQ+88aOh9Yp6evuHBLVd/maC1+qNp7I
YEVw5HupVpCukPlNmSVpypH9+vfVdRcmrxGZiCoskmwoW+8JnmvPWjsvulFc1nqh
89lnz0XAMzHOTOmaK93s+kiJlZDoKJgrDs9B20Jtunur6El7oChR+f5z/AVmNfMr
zessucoRlQ2u4kYMqw/oDKoyE6bkWXhwFB5vjZaz8kXE5HGWF0HCVTnmBF79H/9B
C8Nx3K9FNyA=
=5SNP
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.17/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Pull "Two fixes for v4.17-rc cycle" from Tony Lindgren:
Fix a build regression with split object directories reported by Russell
and fix range sizes for omap4 cm2 and prm modules.
* tag 'omap-for-v4.17/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: Fix build when using split object directories
ARM: dts: Fix cm2 and prm sizes for omap4
- add / enable USB host support for GX boards
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEe4dGDhaSf6n1v/EMWTcYmtP7xmUFAlrXwgkACgkQWTcYmtP7
xmVMZhAAn6vx6OPPjI6TZaGFmgA+LL7vHij2TfHtxbtbf3REa8ef3cxuoaiRAL4L
CxJ3IUM8oNJ/r1wj5i1P+lHkO9cHn6z9mNvShMTn6+0KoyxKP8hxeTECV/8QOGpg
LhUMCtymiHWgO+4nCS6Ch4CwVQUC/LzDt+T9InKAaeMyRp8zpIc6UIF0fTdTUA0M
/kAv9VfLlybUzt9BaBwlS4w0uDc19ewl9h8ZpnUhtkFmLGq6M6netMiT7lDyixc4
VP5VMYd5MkopOZaDgm55P2OvfJ5KiVrRz7Pu8AAbu/7VE9NxyJHAmkmi9DFXIQe1
AYiOQDdbtw0JniM0KULvVlqp3biQD4XbAoGdsVFfnLmu7uUbvXA49O5bnsQQbxrC
alid56TedNnCCMbTRFV4eLGn7M22wq4SlblxLqFziDyQIOMbw0cON2xhryLumXXQ
xOTtaC272H/7viwCcV7NzNLPL6ygPVkWyi6zPrS28wr6BUR5hMDr9sJ7Q7xgbwQp
r1OnoSc6+NTKiwGAUy3cOxgNAJzNWTiAEvut/o6crfE49ZAfcNX9ivtv2rtxhWrn
yG4GF5WpZYCb3+/KlMXjZkaZKd0S2PXjh5TVHSwuBgZeCLC5zVk66iLmL8ScWI2S
NRglrzEw4yqKPpw3pBhrIbfYatc3/On1xoV+ek3QeM/jnyT6RxQ=
=dXD+
-----END PGP SIGNATURE-----
Merge tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into fixes
Pull "Amlogic fixes for v4.17-rc1" from Kevin Hilman:
- add / enable USB host support for GX boards
* tag 'amlogic-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic:
ARM64: dts: meson-gxm-khadas-vim2: enable the USB controller
ARM64: dts: meson-gxl-nexbox-a95x: enable the USB controller
ARM64: dts: meson-gxl-s905x-libretech-cc: enable the USB controller
ARM64: dts: meson-gx-p23x-q20x: enable the USB controller
ARM64: dts: meson-gxl-s905x-p212: enable the USB controller
ARM64: dts: meson-gxm: add GXM specific USB host configuration
ARM64: dts: meson-gxl: add USB host support
The DTS file for the NAS4220B had the pin config for the
ethernet interface set to the pins in the SL3512 SoC while
this system is using SL3516. Fix it by referencing the
right SL3516 pins instead of the SL3512 pins.
Cc: stable@vger.kernel.org
Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Reported-by: Andreas Fiedler <andreas.fiedler@gmx.net>
Reported-by: Roman Yeryomin <roman@advem.lv>
Tested-by: Roman Yeryomin <roman@advem.lv>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
A single patch to fix the new DTC warnings probably enabled during
v4.17 merge window.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJa1L5XAAoJEABBurwxfuKYgccP/A1hZt9r2ScgiOSreq7+cdvH
MIjT2sdu6/XtIS+A0yXaLeICHsXi3VBIP7K/Lo7eJt0lo3RR7t+F0Wtht6Thr3Z2
Lax2v7I1UkimSWHSptjKNWO6H1CbAcbwLG5mn5vC2zFxMhfOkaNqz6nI8BNJybXH
Pt5RhFhW/GbQq6rCpp2Beoa4ZTfFRMNXEvtkV+DK874Gh3KDMNUeJWql66YArh9i
c2Ie8yxtrMGpHC2lVTbYlSYYk65XnpNk3Xs0lsG9LjSXLePuru4l7cD+BXL9rCyz
8KReymPLwSqbpWKA40hFk8o3vOK8VdCeU4hOgYckvWYuCpE907x/28RnqT9FJYm0
cHTWugtXGPEPfYgrM1zn/Z0Q9kyeun0iYBFAUZDAP+HNagAtd1isEV9ioqshd59t
BFOR1ueH1z6Kiymg73l9H7/wv8O40R1gPlzfB0xcP1VbggpVI7s8bafj++OaSHDY
1kJ6v+f+qjfITh1nDzLwTf8d94S/bX3QRksdNmEMy3fi1c3m7j+ajlmCgkdu+0Vg
IjpsFrjZ1ptS7W4wJqB9EMIDBghj/E1YaKR41yByfIuvDASm7nwjb9+HAG3sDxAz
+Unx48FZUyv4AqOhTevNh4u8aSCnOu2SULV5srav1vvmyHLDz5NjSpV3YY7/uTqz
kH9zHPpprsNstH2EM8Jl
=fm2S
-----END PGP SIGNATURE-----
Merge tag 'juno-fixes-4.17' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into fixes
Pull "ARMv8 Juno DT fix for v4.17" from Sudeep Holla:
A single patch to fix the new DTC warnings probably enabled during
v4.17 merge window.
* tag 'juno-fixes-4.17' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
arm64: dts: juno: drop unnecessary address-cells and size-cells properties
The SMM freeze feature was introduced since PerfMon V2. But the current
code unconditionally enables the feature for all platforms. It can
generate #GP exception, if the related FREEZE_WHILE_SMM bit is set for
the machine with PerfMon V1.
To disable the feature for PerfMon V1, perf needs to
- Remove the freeze_on_smi sysfs entry by moving intel_pmu_attrs to
intel_pmu, which is only applied to PerfMon V2 and later.
- Check the PerfMon version before flipping the SMM bit when starting CPU
Fixes: 6089327f54 ("perf/x86: Add sysfs entry to freeze counters on SMI")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ak@linux.intel.com
Cc: eranian@google.com
Cc: acme@redhat.com
Link: https://lkml.kernel.org/r/1524682637-63219-1-git-send-email-kan.liang@linux.intel.com
Arnaldo noticed that the latest kernel is missing the syscall event system
directory in x86. I bisected it down to d5a00528b5 ("syscalls/core,
syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()").
The system call trace events are special, as there is only one trace event
for all system calls (the raw_syscalls). But a macro that wraps the system
calls creates meta data for them that copies the name to find the system
call that maps to the system call table (the number). At boot up, it does a
kallsyms lookup of the system call table to find the function that maps to
the meta data of the system call. If it does not find a function, then that
system call is ignored.
Because the x86 system calls had "__x64_", or "__ia32_" prefixed to the
"sys" for the names, they do not match the default compare algorithm. As
this was a problem for power pc, the algorithm can be overwritten by the
architecture. The solution is to have x86 have its own algorithm to do the
compare and this brings back the system call trace events.
Link: http://lkml.kernel.org/r/20180417174128.0f3457f0@gandalf.local.home
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Fixes: d5a00528b5 ("syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The NMI IPI handler for a receiving CPU increments nmi_ipi_busy_count
over the handler function call, which causes later smp_send_nmi_ipi()
callers to spin until the call is finished.
The stop_this_cpu() function never returns, so the busy count is never
decremeted, which can cause the system to hang in some cases. For
example panic() will call smp_send_stop() early on which calls
stop_this_cpu() on other CPUs, then later in the reboot path,
pnv_restart() will call smp_send_stop() again, which hangs.
Fix this by adding a special case to the stop_this_cpu() handler to
decrement the busy count, because it will never return.
Now that the NMI/non-NMI versions of stop_this_cpu() are different,
split them out into separate functions rather than doing #ifdef tricks
to share the body between the two functions.
Fixes: 6bed323762 ("powerpc: use NMI IPI for smp_send_stop")
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out the functions, tweak change log a bit]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
commit ce9962bf7e22bb3891655c349faff618922d4a73
0day reported warnings at boot on 32-bit systems without NX support:
attempted to set unsupported pgprot: 8000000000000025 bits: 8000000000000000 supported: 7fffffffffffffff
WARNING: CPU: 0 PID: 1 at
arch/x86/include/asm/pgtable.h:540 handle_mm_fault+0xfc1/0xfe0:
check_pgprot at arch/x86/include/asm/pgtable.h:535
(inlined by) pfn_pte at arch/x86/include/asm/pgtable.h:549
(inlined by) do_anonymous_page at mm/memory.c:3169
(inlined by) handle_pte_fault at mm/memory.c:3961
(inlined by) __handle_mm_fault at mm/memory.c:4087
(inlined by) handle_mm_fault at mm/memory.c:4124
The problem is that due to the recent commit which removed auto-massaging
of page protections, filtering page permissions at PTE creation time is not
longer done, so vma->vm_page_prot is passed unfiltered to PTE creation.
Filter the page protections before they are installed in vma->vm_page_prot.
Fixes: fb43d6cb91 ("x86/mm: Do not auto-massage page protections")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: https://lkml.kernel.org/r/20180420222028.99D72858@viggo.jf.intel.com
commit 26d35ca6c3776784f8156e1d6f80cc60d9a2a915
RANDSTRUCT derives its hardening benefits from the attacker's lack of
knowledge about the layout of kernel data structures. Keep the kernel
image non-global in cases where RANDSTRUCT is in use to help keep the
layout a secret.
Fixes: 8c06c7740 (x86/pti: Leave kernel text global for !PCID)
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: https://lkml.kernel.org/r/20180420222026.D0B4AAC9@viggo.jf.intel.com
commit abb67605203687c8b7943d760638d0301787f8d9
Kees reported to me that I made too much of the kernel image global.
It was far more than just text:
I think this is too much set global: _end is after data,
bss, and brk, and all kinds of other stuff that could
hold secrets. I think this should match what
mark_rodata_ro() is doing.
This does exactly that. We use __end_rodata_hpage_align as our
marker both because it is huge-page-aligned and it does not contain
any sections we expect to hold secrets.
Kees's logic was that r/o data is in the kernel image anyway and,
in the case of traditional distributions, can be freely downloaded
from the web, so there's no reason to hide it.
Fixes: 8c06c7740 (x86/pti: Leave kernel text global for !PCID)
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: https://lkml.kernel.org/r/20180420222023.1C8B2B20@viggo.jf.intel.com
commit 231df823c4f04176f607afc4576c989895cff40e
The pageattr.c code attempts to process "faults" when it goes looking
for PTEs to change and finds non-present entries. It allows these
faults in the linear map which is "expected to have holes", but
WARN()s about them elsewhere, like when called on the kernel image.
However, change_page_attr_clear() is now called on the kernel image in the
process of trying to clear the Global bit.
This trips the warning in __cpa_process_fault() if a non-present PTE is
encountered in the kernel image. The "holes" in the kernel image result
from free_init_pages()'s use of set_memory_np(). These holes are totally
fine, and result from normal operation, just as they would be in the kernel
linear map.
Just silence the warning when holes in the kernel image are encountered.
Fixes: 39114b7a7 (x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image)
Reported-by: Mariusz Ceier <mceier@gmail.com>
Reported-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Kees Cook <keescook@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: https://lkml.kernel.org/r/20180420222021.1C7D2B3F@viggo.jf.intel.com
commit 16dce603adc9de4237b7bf2ff5c5290f34373e7b
Part of the global bit _setting_ patches also includes clearing the
Global bit when it should not be enabled. That is done with
set_memory_nonglobal(), which uses change_page_attr_clear() in
pageattr.c under the covers.
The TLB flushing code inside pageattr.c has has checks like
BUG_ON(irqs_disabled()), looking for interrupt disabling that might
cause deadlocks. But, these also trip in early boot on certain
preempt configurations. Just copy the existing BUG_ON() sequence from
cpa_flush_range() to the other two sites and check for early boot.
Fixes: 39114b7a7 (x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image)
Reported-by: Mariusz Ceier <mceier@gmail.com>
Reported-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Kees Cook <keescook@google.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: https://lkml.kernel.org/r/20180420222019.20C4A410@viggo.jf.intel.com
The OPAL RTC driver does not sleep in case it gets OPAL_BUSY or
OPAL_BUSY_EVENT from firmware, which causes large scheduling
latencies, up to 50 seconds have been observed here when RTC stops
responding (BMC reboot can do it).
Fix this by converting it to the standard form OPAL_BUSY loop that
sleeps.
Fixes: 628daa8d5a ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit fb8722735f ("arm64: support __int128 on gcc 5+") added support
for arm64 __int128 with gcc with a version-conditional, but neglected to
enable this for clang, which in fact appears to support aarch64 __int128.
This commit therefore enables it if the compiler is clang, using the
same type of makefile conditional used elsewhere in the tree.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Our arm64_skip_faulting_instruction() helper advances the userspace
singlestep state machine, but this is also called by the kernel BRK
handler, as used for WARN*().
Thus, if we happen to hit a WARN*() while the user singlestep state
machine is in the active-no-pending state, we'll advance to the
active-pending state without having executed a user instruction, and
will take a step exception earlier than expected when we return to
userspace.
Let's fix this by only advancing the state machine when skipping a user
instruction.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit a257e02579 ("arm64/kernel: don't ban ADRP to work around
Cortex-A53 erratum #843419") introduced a function whose name ends with
"_veneer".
This clashes with commit bd8b22d288 ("Kbuild: kallsyms: ignore veneers
emitted by the ARM linker"), which removes symbols ending in "_veneer"
from kallsyms.
The problem was manifested as 'perf test -vvvvv vmlinux' failed,
correctly claiming the symbol 'module_emit_adrp_veneer' was present in
vmlinux, but not in kallsyms.
...
ERR : 0xffff00000809aa58: module_emit_adrp_veneer not on kallsyms
...
test child finished with -1
---- end ----
vmlinux symtab matches kallsyms: FAILED!
Fix the problem by renaming module_emit_adrp_veneer to
module_emit_veneer_for_adrp. Now the test passes.
Fixes: a257e02579 ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We transiently switch to KERNEL_DS in compat_ptrace_gethbpregs() and
compat_ptrace_sethbpregs(), but in either case this is pointless as we
don't perform any uaccess during this window.
let's rip out the redundant addr_limit manipulation.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Debian toolcahin defaults to PIE, and I guess that will also be the case
of most distributions. This causes the following build failure:
AS arch/riscv/kernel/vdso/getcpu.o
AS arch/riscv/kernel/vdso/flush_icache.o
VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg
OBJCOPY arch/riscv/kernel/vdso/vdso.so
AS arch/riscv/kernel/vdso/vdso.o
VDSOLD arch/riscv/kernel/vdso/vdso-dummy.o
LD arch/riscv/kernel/vdso/vdso-syms.o
riscv64-linux-gnu-ld: attempted static link of dynamic object `arch/riscv/kernel/vdso/vdso-dummy.o'
make[2]: *** [arch/riscv/kernel/vdso/Makefile:43: arch/riscv/kernel/vdso/vdso-syms.o] Error 1
make[1]: *** [scripts/Makefile.build:575: arch/riscv/kernel/vdso] Error 2
make: *** [Makefile:1018: arch/riscv/kernel] Error 2
While the root Makefile correctly passes "-fno-PIE" to build individual
object files, the RISC-V kernel also builds vdso-dummy.o as an
executable, which is therefore linked as PIE. Fix that by updating this
specific link rule to also include "-no-pie".
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
DMA_DIRECT_OPS is defined in lib/Kconfig, so don't duplicate it in
arch/riscv/Kconfig.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The addr parameter isn't used for anything. Let's simplify and get rid of
it, like arm.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The current code extracts the physical address for UE errors and then
hooks it up into memory failure infrastructure. On successful
extraction of physical address it wrongly sets "handled = 1" which
means this UE error has been recovered. Since MCE handler gets return
value as handled = 1, it assumes that error has been recovered and
goes back to same NIP. This causes MCE interrupt again and again in a
loop leading to hard lockup.
Also, initialize phys_addr to ULONG_MAX so that we don't end up
queuing undesired page to hwpoison.
Without this patch we see:
Severe Machine check interrupt [Recovered]
NIP: [000000001002588c] PID: 7109 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffd2755940
Physical address: 000020181a080000
...
Severe Machine check interrupt [Recovered]
NIP: [000000001002588c] PID: 7109 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffd2755940
Physical address: 000020181a080000
Severe Machine check interrupt [Recovered]
NIP: [000000001002588c] PID: 7109 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffd2755940
Physical address: 000020181a080000
Memory failure: 0x20181a08: recovery action for dirty LRU page: Recovered
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
...
Watchdog CPU:38 Hard LOCKUP
After this patch we see:
Severe Machine check interrupt [Not recovered]
NIP: [00007fffaae585f4] PID: 7168 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffaafe28ac
Physical address: 00002017c0bd0000
find[7168]: unhandled signal 7 at 00007fffaae585f4 nip 00007fffaae585f4 lr 00007fffaae585e0 code 4
Memory failure: 0x2017c0bd: recovery action for dirty LRU page: Recovered
Fixes: 01eaac2b05 ("powerpc/mce: Hookup ierror (instruction) UE errors")
Fixes: ba41e1e1cc ("powerpc/mce: Hookup derror (load/store) UE errors")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The NPU has a limited number of address translation shootdown (ATSD)
registers and the GPU has limited bandwidth to process ATSDs. This can
result in contention of ATSD registers leading to soft lockups on some
threads, particularly when invalidating a large address range in
pnv_npu2_mn_invalidate_range().
At some threshold it becomes more efficient to flush the entire GPU
TLB for the given MM context (PID) than individually flushing each
address in the range. This patch will result in ranges greater than
2MB being converted from 32+ ATSDs into a single ATSD which will flush
the TLB for the given PID on each GPU.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is a single npu context per set of callback parameters. Callers
should be prevented from overwriting existing callback values so
instead return an error if different parameters are passed.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mark Hairgrove <mhairgrove@nvidia.com>
Tested-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The pnv_npu2_init_context() and pnv_npu2_destroy_context() functions
are used to allocate/free contexts to allow address translation and
shootdown by the NPU on a particular GPU. Context initialisation is
implicitly safe as it is protected by the requirement mmap_sem be held
in write mode, however pnv_npu2_destroy_context() does not require
mmap_sem to be held and it is not safe to call with a concurrent
initialisation for a different GPU.
It was assumed the driver would ensure destruction was not called
concurrently with initialisation. However the driver may be simplified
by allowing concurrent initialisation and destruction for different
GPUs. As npu context creation/destruction is not a performance
critical path and the critical section is not large a single spinlock
is used for simplicity.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reviewed-by: Mark Hairgrove <mhairgrove@nvidia.com>
Tested-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Don't do this via custom code, instead now that we have support in the
arch hotplug/hotunplug code, rely on those routines to do the right
thing.
The existing flush doesn't work because it uses ppc64_caches.l1d.size
instead of ppc64_caches.l1d.line_size.
Fixes: 9d5171a8f2 ("powerpc/powernv: Enable removal of memory for in memory tracing")
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch adds support for flushing potentially dirty cache lines
when memory is hot-plugged/hot-un-plugged. The support is currently
limited to 64 bit systems.
The bug was exposed when mappings for a device were actually
hot-unplugged and plugged in back later. A similar issue was observed
during the development of memtrace, but memtrace does it's own
flushing of region via a custom routine.
These patches do a flush both on hotplug/unplug to clear any stale
data in the cache w.r.t mappings, there is a small race window where a
clean cache line may be created again just prior to tearing down the
mapping.
The patches were tested by disabling the flush routines in memtrace
and doing I/O on the trace file. The system immediately
checkstops (quite reliablly if prior to the hot-unplug of the memtrace
region, we memset the regions we are about to hot unplug). After these
patches no custom flushing is needed in the memtrace code.
Fixes: 9d5171a8f2 ("powerpc/powernv: Enable removal of memory for in memory tracing")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We're missing a sentinel entry in kpti_safe_list. Thus is_midr_in_range_list()
can walk past the end of kpti_safe_list. Depending on the contents of memory,
this could erroneously match a CPU's MIDR, cause a data abort, or other bad
outcomes.
Add the sentinel entry to avoid this.
Fixes: be5b299830 ("arm64: capabilities: Add support for checks based on a list of MIDRs")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The main linker script vmlinux.lds.S for the kernel image merges
the expoline code patch tables into two section ".nospec_call_table"
and ".nospec_return_table". This is *not* done for the modules,
there the sections retain their original names as generated by gcc:
".s390_indirect_call", ".s390_return_mem" and ".s390_return_reg".
The module_finalize code has to check for the compiler generated
section names, otherwise no code patching is done. This slows down
the module code in case of "spectre_v2=off".
Cc: stable@vger.kernel.org # 4.16
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In a multi-threaded program any thread can call execve(). If this
is not done by the thread group leader, the de_thread() function
replaces the pid of the task that calls execve() with the pid of
thread group leader. If the task reaches user space again without
going over __switch_to() the sampling tag is still set to the old
pid.
Define the arch_setup_new_exec function to verify the task pid
and udpate the tag with LPP if it has changed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>