code that was thought unnecessary; however, since then KVM has grown quite
a few cond_resched()s and for that reason the simplified code is prone to
livelocks---one CPUs tries to empty a list of guest page tables while the
others keep adding to them. This adds back the generation-based zapping of
guest page tables, which was not unnecessary after all.
On top of this, there is a fix for a kernel memory leak and a couple of
s390 fixlets as well.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdfJcxAAoJEL/70l94x66D8ocH/jhz+95+TBNF5j0xGmBWiDbH
zQlWmEMkAQ8o66J+503bc2ltQhM8078Ohumgtmq8HFaRgctDiRdjLBcce6aOr4tH
09gBdlWgkVWLhN8AhydSbHh+SLcCWdZQSAWQrvt1aM2BRdz5WECkUdauSL2oHsTP
P58318EL0JrqQaQZtK4qI4eNDiYWdKq2luMu9BTYm3f1Hnk28gorErgBehScYuxE
LlnM4RYedH54UR8opdUDmhHxO7bGTW4SVz4obq0sSOJs190DuVbCxbaJjhP+tSwk
7hPac3tHoYFUIhVtgGD3N/LrsFgvcShmjawLP0szCrR2sWrtTqmb0R63DLxpmyg=
=yuZ9
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"The main change here is a revert of reverts. We recently simplified
some code that was thought unnecessary; however, since then KVM has
grown quite a few cond_resched()s and for that reason the simplified
code is prone to livelocks---one CPUs tries to empty a list of guest
page tables while the others keep adding to them. This adds back the
generation-based zapping of guest page tables, which was not
unnecessary after all.
On top of this, there is a fix for a kernel memory leak and a couple
of s390 fixlets as well"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot
KVM: x86: work around leak of uninitialized stack contents
KVM: nVMX: handle page fault in vmread
KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset()
Last week, Palmer and I learned that there was an error in the RISC-V
kernel image header format that could make it less compatible with the
ARM64 kernel image header format. I had missed this error during my
original reviews of the patch.
The kernel image header format is an interface that impacts
bootloaders, QEMU, and other user tools. Those packages must be
updated to align with whatever is merged in the kernel. We would like
to avoid proliferating these image formats by keeping the RISC-V
header as close as possible to the existing ARM64 header. Since the
arch/riscv patch that adds support for the image header was merged
with our v5.3-rc1 pull request as commit 0f327f2aaa ("RISC-V: Add
an Image header that boot loader can parse."), we think it wise to try
to fix this error before v5.3 is released.
The fix itself should be backwards-compatible with any project that
has already merged support for premature versions of this interface.
It primarily involves ensuring that the RISC-V image header has
something useful in the same field as the ARM64 image header.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl1870sACgkQx4+xDQu9
Kkt33A//SG4fTIyz0rIGTZpJPKV3nXacBq6XvOFxFsHRHlEvD2f/JkSK1Ab+hV5R
vmTkVGCSCVz1C/OEA+KWsjuiJEglII6eOLIRqST1Wm6KumwAwLc78xdgEb1Sm/SC
E7OTYtSqbUjCqzzD1BFcXfbP4mGF/9IBjWI3OcCnb1UcLuL29Mt35gvxI9fF1FB6
+EU96MBQbk4gVUYjKXObTvaAZwWIYrMkOFQmdRgb4jqk42i0hLmKx//WkI1Ajlp8
FDjE2nIo2NAt0N7pImJ/QtqxkOsQjMtOyOscoTyhB4eGJW0+fTyVrt6FpUdYQDQq
vZI/WS2RFYUi2wfj+JNQ959MgsWZZ8z21KbFWwR0HC4k2xRZaxCO48g/VweJA/QW
3f6+CMxYgwF5KzToHvUjlo0wNMW2Xo/FX9bky3gb8rJPWnSx9uu9lfoh17FUD4Ty
cEknaLtmMALA8Lgr8hwTKbZLg7J1ih5r1SPj0UvjpjEmwDUl2doA0EONuuBroEHM
KDerGitg6D0g4B4VlGsHuLMd6Gj/5r2teno97tPoaf5J9mCZ1v2/Q5OL0QwBYd84
5cp+Ox1aQTY6SJq8gftBOD3MmW2lKCC5tT6H0bJvKBAE7tJaLPv5YIj6dp1jfXKB
klzJUdGRsL60EwlL/cbFOurDfhBeQlq8akdzG5Cg5e8q+mISSTE=
=Jt6U
-----END PGP SIGNATURE-----
Merge tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Paul Walmsley:
"Last week, Palmer and I learned that there was an error in the RISC-V
kernel image header format that could make it less compatible with the
ARM64 kernel image header format. I had missed this error during my
original reviews of the patch.
The kernel image header format is an interface that impacts
bootloaders, QEMU, and other user tools. Those packages must be
updated to align with whatever is merged in the kernel. We would like
to avoid proliferating these image formats by keeping the RISC-V
header as close as possible to the existing ARM64 header. Since the
arch/riscv patch that adds support for the image header was merged
with our v5.3-rc1 pull request as commit 0f327f2aaa ("RISC-V: Add
an Image header that boot loader can parse."), we think it wise to try
to fix this error before v5.3 is released.
The fix itself should be backwards-compatible with any project that
has already merged support for premature versions of this interface.
It primarily involves ensuring that the RISC-V image header has
something useful in the same field as the ARM64 image header"
* tag 'riscv/for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: modify the Image header to improve compatibility with the ARM64 header
- prevent a user triggerable oops in the migration code
- do not leak kernel stack content
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJdejosAAoJEBF7vIC1phx8ZcYP/09WMmcbOexGvopqyMzIWgAv
xpSHAW0+mGriu9b41OwkxBsMG3MxUzk86b3zL0r5eaigWXSuE2NU0OhScqF9ehMX
pTtoeSzFJsPFwGQrOKIhpgcNzOJ+YfVqTDlf5dxq9uSNYF32suuz0Dw4P9PdFJOg
k8prJXiKu+bL21TcbhWsAAP7Gb5/DA26p4d5KM3wJe351Af9lrLrDF2z+pKe9fbY
v0vMcH3tJoBOOTYUSJeptEWU9OlYljMrJN7kkmXCEC8yklwoXPDNgAC8Yg2SfqYM
xNKVkX/rY97cn1Dq0LpAvEjMDYvu7KbOM1qQE9A67gRLIjuGJnDyEa+j/iB/tOrz
BMmTdut44XRaVZVdDL+d2pg3LKI+1+UV4XTwpD4g1tSpYLar3dJVb9mq00OzdCAg
TsK+pQYTSZig+H4ubtikgm9pFGKOB2Jsp2+FoC7jYxhYQWyj4syBkSoaaUdY0LvE
/Du3NY3RaG4yi2K2XV0yjBVAjpXxYMWqvzJYTC9XlrEQJ5nAmiefTgxZmcg4ZCMw
0YVRigG7vz8oKpVRl/6smGd/U+qTNZN4cXnFgUr71yONiIxsSndUZ/Yledtf+KQR
uzPfvIwYpRzwqVnXkkFb+PNxvJVftCbe2rRI4D549VsbmEJmSadjiB5aW1Rj3fMN
47ZjXZmmGETR8BtQEM37
=LxGy
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-master-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master
KVM: s390: Fixes for 5.3
- prevent a user triggerable oops in the migration code
- do not leak kernel stack content
James Harvey reported a livelock that was introduced by commit
d012a06ab1 ("Revert "KVM: x86/mmu: Zap only the relevant pages when
removing a memslot"").
The livelock occurs because kvm_mmu_zap_all() as it exists today will
voluntarily reschedule and drop KVM's mmu_lock, which allows other vCPUs
to add shadow pages. With enough vCPUs, kvm_mmu_zap_all() can get stuck
in an infinite loop as it can never zap all pages before observing lock
contention or the need to reschedule. The equivalent of kvm_mmu_zap_all()
that was in use at the time of the reverted commit (4e103134b8, "KVM:
x86/mmu: Zap only the relevant pages when removing a memslot") employed
a fast invalidate mechanism and was not susceptible to the above livelock.
There are three ways to fix the livelock:
- Reverting the revert (commit d012a06ab1) is not a viable option as
the revert is needed to fix a regression that occurs when the guest has
one or more assigned devices. It's unlikely we'll root cause the device
assignment regression soon enough to fix the regression timely.
- Remove the conditional reschedule from kvm_mmu_zap_all(). However, although
removing the reschedule would be a smaller code change, it's less safe
in the sense that the resulting kvm_mmu_zap_all() hasn't been used in
the wild for flushing memslots since the fast invalidate mechanism was
introduced by commit 6ca18b6950 ("KVM: x86: use the fast way to
invalidate all pages"), back in 2013.
- Reintroduce the fast invalidate mechanism and use it when zapping shadow
pages in response to a memslot being deleted/moved, which is what this
patch does.
For all intents and purposes, this is a revert of commit ea145aacf4
("Revert "KVM: MMU: fast invalidate all pages"") and a partial revert of
commit 7390de1e99 ("Revert "KVM: x86: use the fast way to invalidate
all pages""), i.e. restores the behavior of commit 5304b8d37c ("KVM:
MMU: fast invalidate all pages") and commit 6ca18b6950 ("KVM: x86:
use the fast way to invalidate all pages") respectively.
Fixes: d012a06ab1 ("Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"")
Reported-by: James Harvey <jamespharvey20@gmail.com>
Cc: Alex Willamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Emulation of VMPTRST can incorrectly inject a page fault
when passed an operand that points to an MMIO address.
The page fault will use uninitialized kernel stack memory
as the CR2 and error code.
The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
exit to userspace; however, it is not an easy fix, so for now just ensure
that the error code and CR2 are zero.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Cc: stable@vger.kernel.org
[add comment]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The implementation of vmread to memory is still incomplete, as it
lacks the ability to do vmread to I/O memory just like vmptrst.
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Part of the intention during the definition of the RISC-V kernel image
header was to lay the groundwork for a future merge with the ARM64
image header. One error during my original review was not noticing
that the RISC-V header's "magic" field was at a different size and
position than the ARM64's "magic" field. If the existing ARM64 Image
header parsing code were to attempt to parse an existing RISC-V kernel
image header format, it would see a magic number 0. This is
undesirable, since it's our intention to align as closely as possible
with the ARM64 header format. Another problem was that the original
"res3" field was not being initialized correctly to zero.
Address these issues by creating a 32-bit "magic2" field in the RISC-V
header which matches the ARM64 "magic" field. RISC-V binaries will
store "RSC\x05" in this field. The intention is that the use of the
existing 64-bit "magic" field in the RISC-V header will be deprecated
over time. Increment the minor version number of the file format to
indicate this change, and update the documentation accordingly. Fix
the assembler directives in head.S to ensure that reserved fields are
properly zero-initialized.
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Reported-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Karsten Merker <merker@debian.org>
Link: https://lore.kernel.org/linux-riscv/194c2f10c9806720623430dbf0cc59a965e50448.camel@wdc.com/T/#u
Link: https://lore.kernel.org/linux-riscv/mhng-755b14c4-8f35-4079-a7ff-e421fd1b02bc@palmer-si-x1e/T/#t
Pull x86 fixes from Ingo Molnar:
"A KVM guest fix, and a kdump kernel relocation errors fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/timer: Force PIT initialization when !X86_FEATURE_ARAT
x86/purgatory: Change compiler flags from -mcmodel=kernel to -mcmodel=large to fix kexec relocation errors
When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
an interrupt, we convert them from the legacy struct kvm_s390_interrupt
to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
However, this function does not take care of all types of interrupts
that we can inject into the guest later (see do_inject_vcpu()). Since we
do not clear out the s390irq values before calling s390int_to_s390irq(),
there is a chance that we copy random data from the kernel stack which
could be leaked to the userspace later.
Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
interrupt: s390int_to_s390irq() does not handle it, and the function
__inject_pfault_init() later copies irq->u.ext which contains the
random kernel stack data. This data can then be leaked either to
the guest memory in __deliver_pfault_init(), or the userspace might
retrieve it directly with the KVM_S390_GET_IRQ_STATE ioctl.
Fix it by handling that interrupt type in s390int_to_s390irq(), too,
and by making sure that the s390irq struct is properly pre-initialized.
And while we're at it, make sure that s390int_to_s390irq() now
directly returns -EINVAL for unknown interrupt types, so that we
immediately get a proper error code in case we add more interrupt
types to do_inject_vcpu() without updating s390int_to_s390irq()
sometime in the future.
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/kvm/20190912115438.25761-1-thuth@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If userspace doesn't set KVM_MEM_LOG_DIRTY_PAGES on memslot before calling
kvm_s390_vm_start_migration(), kernel will oops with:
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:0000000002a2000b R2:00000001bff8c00b R3:00000001bff88007 S:00000001bff91000 P:000000000000003d
Oops: 0004 ilc:2 [#1] SMP
...
Call Trace:
([<001fffff804ec552>] kvm_s390_vm_set_attr+0x347a/0x3828 [kvm])
[<001fffff804ecfc0>] kvm_arch_vm_ioctl+0x6c0/0x1998 [kvm]
[<001fffff804b67e4>] kvm_vm_ioctl+0x51c/0x11a8 [kvm]
[<00000000008ba572>] do_vfs_ioctl+0x1d2/0xe58
[<00000000008bb284>] ksys_ioctl+0x8c/0xb8
[<00000000008bb2e2>] sys_ioctl+0x32/0x40
[<000000000175552c>] system_call+0x2b8/0x2d8
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<0000000000dbaf60>] __memset+0xc/0xa0
due to ms->dirty_bitmap being NULL, which might crash the host.
Make sure that ms->dirty_bitmap is set before using it or
return -EINVAL otherwise.
Cc: <stable@vger.kernel.org>
Fixes: afdad61615 ("KVM: s390: Fix storage attributes migration with memory slots")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/kvm/20190911075218.29153-1-imammedo@redhat.com/
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
These are two regression fixes for bugs that got introduced
during the system call rework that went into linux-5.1
but only bisected and fixed now:
- One patch affects semtimedop() on many of the less
common 32-bit architectures, this just needs a single-line
bugfix.
- The other affects only sparc64 and has a slightly more
invasive workaround to apply the same change to sparc64
that was done to the generic code used everywhere else.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJddAhgAAoJEJpsee/mABjZCFsP/iV1iH6ezPC4thGazZ1n9wHw
UOi+lVJPN3qZ2HmeSGYCFfktpbkj6UUT6QdwxfvZkD3EtKf1Zl+lMgJWAgE1/aUa
ikCyKjSsnW3cv4Yd5uX9gYSzIdKrGY2XrDjq1J3hjUMPpZfj03ndKCYHEbMLDrO8
g91ZGUuHqxW0NCkDq6mGed2d0FOCLgwHqqwEEZgvaEzMDv1xG2TzrxS566YfxSVA
HTD7JjsaEh02Q3AYwpq3MOxWXxDQE0b7YI+LoEs8Maqi+0zSh946vMVloo/a1pv2
SK9ol1zGQOwCojBTevcpHwdeeO2mYvjwDOa0aFucErAXiBPROXzFGIT/ZJ0WpZuf
vI0JCCD3Rfp/oXRZePuToN2oE9HrrR6myhAMn+Nza3SbPlT/sQEaYvuStV/s8pQo
HUNY234zACeRM5mPFUlbfYOBKL1loCpd+OjOA4fMoXGaG8HpnFBjnmz32yzo01zM
oAiHzLXBeMLhJhZEX1b80rwbg4k/Kjn/ZQUEkQ0rc0+YEzXx/xd1FOsYtUgsDGFV
hPvE7mO++1w9Wnu7dAIiCmk3qufF3oLjEngfHBcRy4gNxks7Ub9zEQwj9gCqbugq
iHVQFQx25ToMnWHJ9ferkzj+eP1yFlkRiteoXMK0w5mU+3iF0qoTUbeMoucwlOmK
VMncXfoiOdaOvQB6tQ/t
=XsII
-----END PGP SIGNATURE-----
Merge tag 'ipc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull ipc regression fixes from Arnd Bergmann:
"Fix ipc regressions from y2038 patches
These are two regression fixes for bugs that got introduced during the
system call rework that went into linux-5.1 but only bisected and
fixed now:
- One patch affects semtimedop() on many of the less common 32-bit
architectures, this just needs a single-line bugfix.
- The other affects only sparc64 and has a slightly more invasive
workaround to apply the same change to sparc64 that was done to the
generic code used everywhere else"
* tag 'ipc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
ipc: fix sparc64 ipc() wrapper
ipc: fix semtimedop for generic 32-bit architectures
KVM guests with commit c8c4076723 ("x86/timer: Skip PIT initialization on
modern chipsets") applied to guest kernel have been observed to have
unusually higher CPU usage with symptoms of increase in vm exits for HLT
and MSW_WRITE (MSR_IA32_TSCDEADLINE).
This is caused by older QEMUs lacking support for X86_FEATURE_ARAT. lapic
clock retains CLOCK_EVT_FEAT_C3STOP and nohz stays inactive. There's no
usable broadcast device either.
Do the PIT initialization if guest CPU lacks X86_FEATURE_ARAT. On real
hardware it shouldn't matter as ARAT and DEADLINE come together.
Fixes: c8c4076723 ("x86/timer: Skip PIT initialization on modern chipsets")
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This reverts commit 558682b529.
Chris Wilson reports that it breaks his CPU hotplug test scripts. In
particular, it breaks offlining and then re-onlining the boot CPU, which
we treat specially (and the BIOS does too).
The symptoms are that we can offline the CPU, but it then does not come
back online again:
smpboot: CPU 0 is now offline
smpboot: Booting Node 0 Processor 0 APIC 0x0
smpboot: do_boot_cpu failed(-1) to wakeup CPU#0
Thomas says he knows why it's broken (my personal suspicion: our magic
handling of the "cpu0_logical_apicid" thing), but for 5.3 the right fix
is to just revert it, since we've never touched the LDR bits before, and
it's not worth the risk to do anything else at this stage.
[ Hotpluging of the boot CPU is special anyway, and should be off by
default. See the "BOOTPARAM_HOTPLUG_CPU0" config option and the
cpu0_hotplug kernel parameter.
In general you should not do it, and it has various known limitations
(hibernate and suspend require the boot CPU, for example).
But it should work, even if the boot CPU is special and needs careful
treatment - Linus ]
Link: https://lore.kernel.org/lkml/156785100521.13300.14461504732265570003@skylake-alporthouse-com/
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bandan Das <bsd@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt bisected a sparc64 specific issue with semctl, shmctl and msgctl
to a commit from my y2038 series in linux-5.1, as I missed the custom
sys_ipc() wrapper that sparc64 uses in place of the generic version that
I patched.
The problem is that the sys_{sem,shm,msg}ctl() functions in the kernel
now do not allow being called with the IPC_64 flag any more, resulting
in a -EINVAL error when they don't recognize the command.
Instead, the correct way to do this now is to call the internal
ksys_old_{sem,shm,msg}ctl() functions to select the API version.
As we generally move towards these functions anyway, change all of
sparc_ipc() to consistently use those in place of the sys_*() versions,
and move the required ksys_*() declarations into linux/syscalls.h
The IS_ENABLED(CONFIG_SYSVIPC) check is required to avoid link
errors when ipc is disabled.
Reported-by: Matt Turner <mattst88@gmail.com>
Fixes: 275f22148e ("ipc: rename old-style shmctl/semctl/msgctl syscalls")
Cc: stable@vger.kernel.org
Tested-by: Matt Turner <mattst88@gmail.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
There are three more fixes for this week:
- The Windows-on-ARM laptops require a workaround to
prevent crashing at boot from ACPI
- The Renesas "draak" board needs one bugfix for
the backlight regulator
- Also for Renesas, the "hihope" board accidentally
had its eMMC turned off in the 5.3 merge window.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJdcrXdAAoJEJpsee/mABjZ99QP/0Js6+FE+JXOsiUuREdiU8JC
CmKPbtPT+IyOP+068bpkU1cWWVJ0fyF126D7mfsQfL/VrQTnqCAqkytBidUmh5kX
LT0392kaB+zLlbCgxX3xdBOBpMT2j/kFE9YbtBQG59867zH1Y5q6U/Pek0lWze39
llraW2Nlwr/SW+Ffw0tGLXi8dl9FVYNl3jNKfK2/EM51kEeqwTcJvI0WXgOtsUK3
oiEamod7mQNGEcqnxWf/W9Pj76Y8dlw5H7Q2aTqsb4bYOB0QdCUZqiLfXhyQn0qZ
wy3bnC5Y+nIg9N2I4GRp0MLQ54xdN41gZtX25OdmPIvawxnf2dfwinN9EeGxUn3G
RTgv4eBRRNVlheEdUkoyvLaz9jtOO5NCebti9foYcZv9HDEoIHhCqCjgmU7DrW+2
eP+6pTwnbokbCDtsQ4okfJrAlMkOz4ynGs3sDwzqyxwXKr8Ez0Gho7nVbrRakH4A
fOmDAZEqcNNGCUC5S1LLZNVLC0hp7HDb50uqJnVcirSxw2Qr/RPxw7lY4iaG1Gcd
g8NXmLlkWIGpKe4VofPDqRq7Z5UrpAlefpaO3YVV5k1GH3Z6qmDZpP2ItEdue4yr
wagaHBuv/eJD8xdEij4uPzx+TshZ7JHcO4IjnTE9rUaPi62tyNigJsuik8LNq5zv
Lm3ZJwXLhzUDG3XUS+JD
=f5ge
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"There are three more fixes for this week:
- The Windows-on-ARM laptops require a workaround to prevent crashing
at boot from ACPI
- The Renesas 'draak' board needs one bugfix for the backlight
regulator
- Also for Renesas, the 'hihope' board accidentally had its eMMC
turned off in the 5.3 merge window"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
soc: qcom: geni: Provide parameter error checking
arm64: dts: renesas: hihope-common: Fix eMMC status
arm64: dts: renesas: r8a77995: draak: Fix backlight regulator name
One fix for a boot hang on some Freescale machines when PREEMPT is enabled.
Two CVE fixes for bugs in our handling of FP registers and transactional memory,
both of which can result in corrupted FP state, or FP state leaking between
processes.
Thanks to:
Chris Packham, Christophe Leroy, Gustavo Romero, Michael Neuling.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl1x06oTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgCZzD/90EyaWJVS8WPZopoIdnuOfB/F7EZFY
Lhgd640S1p4o8BUZaQ1T19JOzp6HlO38myOptBufY0BsIJW0M2GwngnBPzSPW8r7
ImTTf5cU0CDe2m3OJdfBrVpnGmUsmoWxwrsFJZ9wbsXhCwbbUzOUuxD/B9wBIGi/
sPpTlaYZBhu3cKs9EWPKAODJhtEf55Q1c62gftfj8Y5u8uxQGinYInCghAUr+3Zv
uCw1CSxOV7yGxfgc1sbOptidOiG4Pljw4EDCUFLpjWTYgPVERASbPHs3C4xuAHGq
IYuNDUJbwrxMU9BKLFzvL4MKWa5XtzLE34oY8SuyyVAbIQTszgCn2rIwlJXH88PO
UtId9accmS+dy2lRI+90dC0qeTgUUIZXS1NF0cl5YNRN0TlMyjHL2/sRxCZF2svF
EaGNjTQLAsfX0ccO9xQr8+KBSfFURMEkO8QQAR0lzJmIgbvSuzfjlZpbcYd2Nqfe
EiYU4GeAQSn14vi0ZMdRWxc1rki9pPhGkrUwToDALsiEedRB03olM955uecf7fra
S8MzHFBYh8Apd/lsAj53uAbL2rIHDJ5+6/eezYp7bRbo6FlvWDs9kmYTX3p3ixq1
Q4gDHfbwnWxxhjUBri5QNZF9YHgkyGPURGpIbdXk9R4Hc7ihQWwDBcSrueca51Ug
m97SLF5/+yWx0A==
=C+wa
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One fix for a boot hang on some Freescale machines when PREEMPT is
enabled.
Two CVE fixes for bugs in our handling of FP registers and
transactional memory, both of which can result in corrupted FP state,
or FP state leaking between processes.
Thanks to: Chris Packham, Christophe Leroy, Gustavo Romero, Michael
Neuling"
* tag 'powerpc-5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts
powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
powerpc/64e: Drop stale call to smp_processor_id() which hangs SMP startup
The last change to this Makefile caused relocation errors when loading
a kdump kernel. Restore -mcmodel=large (not -mcmodel=kernel),
-ffreestanding, and -fno-zero-initialized-bsss, without reverting to
the former practice of resetting KBUILD_CFLAGS.
Purgatory.ro is a standalone binary that is not linked against the
rest of the kernel. Its image is copied into an array that is linked
to the kernel, and from there kexec relocates it wherever it desires.
With the previous change to compiler flags, the error "kexec: Overflow
in relocation type 11 value 0x11fffd000" was encountered when trying
to load the crash kernel. This is from kexec code trying to relocate
the purgatory.ro object.
From the error message, relocation type 11 is R_X86_64_32S. The
x86_64 ABI says:
"The R_X86_64_32 and R_X86_64_32S relocations truncate the
computed value to 32-bits. The linker must verify that the
generated value for the R_X86_64_32 (R_X86_64_32S) relocation
zero-extends (sign-extends) to the original 64-bit value."
This type of relocation doesn't work when kexec chooses to place the
purgatory binary in memory that is not reachable with 32 bit
addresses.
The compiler flag -mcmodel=kernel allows those type of relocations to
be emitted, so revert to using -mcmodel=large as was done before.
Also restore the -ffreestanding and -fno-zero-initialized-bss flags
because they are appropriate for a stand alone piece of object code
which doesn't explicitly zero the bss, and one other report has said
undefined symbols are encountered without -ffreestanding.
These identical compiler flag changes need to happen for every object
that becomes part of the purgatory.ro object, so gather them together
first into PURGATORY_CFLAGS_REMOVE and PURGATORY_CFLAGS, and then
apply them to each of the objects that have C source. Do not apply
any of these flags to kexec-purgatory.o, which is not part of the
standalone object but part of the kernel proper.
Tested-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Tested-by: Andreas Smas <andreas@lonelycoder.com>
Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: None
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: clang-built-linux@googlegroups.com
Cc: dimitri.sivanich@hpe.com
Cc: mike.travis@hpe.com
Cc: russ.anderson@hpe.com
Fixes: b059f801a9 ("x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS")
Link: https://lkml.kernel.org/r/20190905202346.GA26595@swahl-linux
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- EFI boot fix for signed kernels
- an AC flags fix related to UBSAN
- Hyper-V infinite loop fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hyper-v: Fix overflow bug in fill_gva_list()
x86/uaccess: Don't leak the AC flags into __get_user() argument evaluation
x86/boot: Preserve boot_params.secure_boot from sanitizing
* RZ/G2M based HiHope main board
- Re-enabled accidently disabled SDHI3 (eMMC) support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE4nzZofWswv9L/nKF189kaWo3T74FAl1w1ksACgkQ189kaWo3
T75WQBAAoPH+Gt2LUEhfya7cVbCEVAX6kkbVh5qSwmaGG3ZF3jBFlCZBaORQh4LA
Ymm/WGWA9hzamNTDeGH0cePKN1QGGNRy4Z9ICxt8+uDiaCFJilKLD3q3wq0NVNRY
SvepE82KYn3CVXL+3pZAi4za2VbJqSSFBWyTrEURXLmOjWdmg0IsARbW5JDgcCwY
NPK4Ohc2GKdIMtGagDZ3EzkeU7f0N2sbyMqSKbe/AXhI3qF8FTiR0Lmj7ik4HAay
UXLV/IPlupN+cTY4QW6PzziTZ1A2drrYigO5H9QFoyvSRyHswiXAN/36QYPx5Lir
i/PH7+x9CxkSM42h1ujLURxdlhUfV6pErSMhcp9BBJRhVhrz0BDRSmuiBOzmN3xi
eDPC/gc66KXv4rTMOYXb12WfT59O6dVXKaGQVYMFWqO3hf2Y6Uo6SWg07JGjvdNQ
Oapi2oJPWVOV2xPZMQuAqTffnUYJekdkLrjrEUUaWV7Gip+3mXILNWiJOGVZ4j8/
Z2/yEYpJdSnhRiaZemvNDqcbR1spnOsxlBQKaWvC2Q2DOUb694Fp1cfAOJ5+XoQA
wlddZYujZj0mnZ557rOQbWOxAxkwdzBl8tgLQ5fnRFrtmqHLTHnOhXiFuz6cPM3e
tfihOemfp3wJUb1QZuELHaAKcTDcwIwhIjqV7jc//sAOIUcG/Fg=
=rBep
-----END PGP SIGNATURE-----
Merge tag 'renesas-fixes2-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into arm/fixes
Second Round of Renesas ARM Based SoC Fixes for v5.3
* RZ/G2M based HiHope main board
- Re-enabled accidently disabled SDHI3 (eMMC) support
* tag 'renesas-fixes2-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: renesas: hihope-common: Fix eMMC status
Link: https://lore.kernel.org/r/cover.1567675986.git.horms+renesas@verge.net.au
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* R-Car D3 (r8a77995) based Draak Board
- Correct backlight regulator name in device tree
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE4nzZofWswv9L/nKF189kaWo3T74FAl1WsAkACgkQ189kaWo3
T76iZA//ST+Llf3DLMcmhFHci2ld8khWl+NcxVAKDjDXgxZ85jxqzxs2b8kshYoV
xAIdt+zwS5GOXpXjDbuw2uufLTEGrmS8v98NxBlxvmdr35PKQE6wXRs+YfisXlhL
bnY4pG8AT3NnN8iiQ4DWkRVmfvc2joaWmyczELvrPCSzLtguCAkjm9m7hfPiVxT+
L5tf01LhYGxRdYoXI/c+kIDyYfgQOhcGMkIk22R3p+l9QM2NE8HqhMPqAqPLYTNA
c/bueDoPxyFdNp7drPMTdkWzovQ0fsslXT1GvbZ6MHJN4e2OurcGHsoU5m/gUJrH
f6C6y3Jw/nInRI13SeI3hAAHnmycND1O5TPqfYMJksmJU2c5x1Zabj2Su4ztNUPa
rMxF4wgqPzqLXDT1xcCacdgUketgYrRmcGUE/KkEIZxcx2eVrLN9TSZ8iOr0ic/E
pwxWUF9OobNYb20sJEiV3esYAwcQq/tqSnr6Hd4vBd2ptqZAOq81mfvt4wnJuwuO
jAAtyV7cdO49lpKf97WfCdqZvArpohKoF6w4Zi9TDBtFGkOj5lkNATc7w6eJkcye
vUDmJ2+9zR+auGMCn3kwzG62yO4KJDnGJgjxMsXh1Kt1MQgpVI2wTFczJUaC4/M2
Lcq+pShfI1qYvAeyy+80RRkR5jiyfGncrrpq1dnYeX6HC8E946w=
=YACQ
-----END PGP SIGNATURE-----
Merge tag 'renesas-fixes-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into arm/fixes
Renesas ARM Based SoC Fixes for v5.3
* R-Car D3 (r8a77995) based Draak Board
- Correct backlight regulator name in device tree
* tag 'renesas-fixes-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: renesas: r8a77995: draak: Fix backlight regulator name
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
When in userspace and MSR FP=0 the hardware FP state is unrelated to
the current process. This is extended for transactions where if tbegin
is run with FP=0, the hardware checkpoint FP state will also be
unrelated to the current process. Due to this, we need to ensure this
hardware checkpoint is updated with the correct state before we enable
FP for this process.
Unfortunately we get this wrong when returning to a process from a
hardware interrupt. A process that starts a transaction with FP=0 can
take an interrupt. When the kernel returns back to that process, we
change to FP=1 but with hardware checkpoint FP state not updated. If
this transaction is then rolled back, the FP registers now contain the
wrong state.
The process looks like this:
Userspace: Kernel
Start userspace
with MSR FP=0 TM=1
< -----
...
tbegin
bne
Hardware interrupt
---- >
<do_IRQ...>
....
ret_from_except
restore_math()
/* sees FP=0 */
restore_fp()
tm_active_with_fp()
/* sees FP=1 (Incorrect) */
load_fp_state()
FP = 0 -> 1
< -----
Return to userspace
with MSR TM=1 FP=1
with junk in the FP TM checkpoint
TM rollback
reads FP junk
When returning from the hardware exception, tm_active_with_fp() is
incorrectly making restore_fp() call load_fp_state() which is setting
FP=1.
The fix is to remove tm_active_with_fp().
tm_active_with_fp() is attempting to handle the case where FP state
has been changed inside a transaction. In this case the checkpointed
and transactional FP state is different and hence we must restore the
FP state (ie. we can't do lazy FP restore inside a transaction that's
used FP). It's safe to remove tm_active_with_fp() as this case is
handled by restore_tm_state(). restore_tm_state() detects if FP has
been using inside a transaction and will set load_fp and call
restore_math() to ensure the FP state (checkpoint and transaction) is
restored.
This is a data integrity problem for the current process as the FP
registers are corrupted. It's also a security problem as the FP
registers from one process may be leaked to another.
Similarly for VMX.
A simple testcase to replicate this will be posted to
tools/testing/selftests/powerpc/tm/tm-poison.c
This fixes CVE-2019-15031.
Fixes: a7771176b4 ("powerpc: Don't enable FP/Altivec if not checkpointed")
Cc: stable@vger.kernel.org # 4.15+
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190904045529.23002-2-gromero@linux.vnet.ibm.com
When we take an FP unavailable exception in a transaction we have to
account for the hardware FP TM checkpointed registers being
incorrect. In this case for this process we know the current and
checkpointed FP registers must be the same (since FP wasn't used
inside the transaction) hence in the thread_struct we copy the current
FP registers to the checkpointed ones.
This copy is done in tm_reclaim_thread(). We use thread->ckpt_regs.msr
to determine if FP was on when in userspace. thread->ckpt_regs.msr
represents the state of the MSR when exiting userspace. This is setup
by check_if_tm_restore_required().
Unfortunatley there is an optimisation in giveup_all() which returns
early if tsk->thread.regs->msr (via local variable `usermsr`) has
FP=VEC=VSX=SPE=0. This optimisation means that
check_if_tm_restore_required() is not called and hence
thread->ckpt_regs.msr is not updated and will contain an old value.
This can happen if due to load_fp=255 we start a userspace process
with MSR FP=1 and then we are context switched out. In this case
thread->ckpt_regs.msr will contain FP=1. If that same process is then
context switched in and load_fp overflows, MSR will have FP=0. If that
process now enters a transaction and does an FP instruction, the FP
unavailable will not update thread->ckpt_regs.msr (the bug) and MSR
FP=1 will be retained in thread->ckpt_regs.msr. tm_reclaim_thread()
will then not perform the required memcpy and the checkpointed FP regs
in the thread struct will contain the wrong values.
The code path for this happening is:
Userspace: Kernel
Start userspace
with MSR FP/VEC/VSX/SPE=0 TM=1
< -----
...
tbegin
bne
fp instruction
FP unavailable
---- >
fp_unavailable_tm()
tm_reclaim_current()
tm_reclaim_thread()
giveup_all()
return early since FP/VMX/VSX=0
/* ckpt MSR not updated (Incorrect) */
tm_reclaim()
/* thread_struct ckpt FP regs contain junk (OK) */
/* Sees ckpt MSR FP=1 (Incorrect) */
no memcpy() performed
/* thread_struct ckpt FP regs not fixed (Incorrect) */
tm_recheckpoint()
/* Put junk in hardware checkpoint FP regs */
....
< -----
Return to userspace
with MSR TM=1 FP=1
with junk in the FP TM checkpoint
TM rollback
reads FP junk
This is a data integrity problem for the current process as the FP
registers are corrupted. It's also a security problem as the FP
registers from one process may be leaked to another.
This patch moves up check_if_tm_restore_required() in giveup_all() to
ensure thread->ckpt_regs.msr is updated correctly.
A simple testcase to replicate this will be posted to
tools/testing/selftests/powerpc/tm/tm-poison.c
Similarly for VMX.
This fixes CVE-2019-15030.
Fixes: f48e91e87e ("powerpc/tm: Fix FP and VMX register corruption")
Cc: stable@vger.kernel.org # 4.12+
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190904045529.23002-1-gromero@linux.vnet.ibm.com
When the 'start' parameter is >= 0xFF000000 on 32-bit
systems, or >= 0xFFFFFFFF'FF000000 on 64-bit systems,
fill_gva_list() gets into an infinite loop.
With such inputs, 'cur' overflows after adding HV_TLB_FLUSH_UNIT
and always compares as less than end. Memory is filled with
guest virtual addresses until the system crashes.
Fix this by never incrementing 'cur' to be larger than 'end'.
Reported-by: Jong Hyun Park <park.jonghyun@yonsei.ac.kr>
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 2ffd9e33ce ("x86/hyper-v: Use hypercall for remote TLB flush")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Identical to __put_user(); the __get_user() argument evalution will too
leak UBSAN crud into the __uaccess_begin() / __uaccess_end() region.
While uncommon this was observed to happen for:
drivers/xen/gntdev.c: if (__get_user(old_status, batch->status[i]))
where UBSAN added array bound checking.
This complements commit:
6ae865615f ("x86/uaccess: Dont leak the AC flag into __put_user() argument evaluation")
Tested-by Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: broonie@kernel.org
Cc: sfr@canb.auug.org.au
Cc: akpm@linux-foundation.org
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: mhocko@suse.cz
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20190829082445.GM2369@hirez.programming.kicks-ass.net
Commit
a90118c445 ("x86/boot: Save fields explicitly, zero out everything else")
now zeroes the secure boot setting information (enabled/disabled/...)
passed by the boot loader or by the kernel's EFI handover mechanism.
The problem manifests itself with signed kernels using the EFI handoff
protocol with grub and the kernel loses the information whether secure
boot is enabled in the firmware, i.e., the log message "Secure boot
enabled" becomes "Secure boot could not be determined".
efi_main() arch/x86/boot/compressed/eboot.c sets this field early but it
is subsequently zeroed by the above referenced commit.
Include boot_params.secure_boot in the preserve field list.
[ bp: restructure commit message and massage. ]
Fixes: a90118c445 ("x86/boot: Save fields explicitly, zero out everything else")
Signed-off-by: John S. Gruber <JohnSGruber@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: stable <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/CAPotdmSPExAuQcy9iAHqX3js_fc4mMLQOTr5RBGvizyCOPcTQQ@mail.gmail.com
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Fix the bogus detection of 32bit user mode for uretprobes which
caused corruption of the user return address resulting in
application crashes. In the uprobes handler in_ia32_syscall() is
obviously always returning false on a 64bit kernel. Use
user_64bit_mode() instead which works correctly.
- Prevent large page splitting when ftrace flips RW/RO on the kernel
text which caused iTLB performance issues. Ftrace wants to be
converted to text_poke() which avoids the problem, but for now
allow large page preservation in the static protections check when
the change request spawns a full large page.
- Prevent arch_dynirq_lower_bound() from returning 0 when the IOAPIC
is configured via device tree. In the device tree case the GSI 1:1
mapping is meaningless therefore the lower bound which protects the
GSI range on ACPI machines is irrelevant. Return the lower bound
which the core hands to the function instead of blindly returning 0
which causes the core to allocate the invalid virtual interupt
number 0 which in turn prevents all drivers from allocating and
requesting an interrupt.
- Remove the bogus initialization of LDR and DFR in the 32bit bigsmp
APIC driver. That uses physical destination mode where LDR/DFR are
ignored, but the initialization and the missing clear of LDR caused
the APIC to be left in a inconsistent state on kexec/reboot.
- Clear LDR when clearing the APIC registers so the APIC is in a well
defined state.
- Initialize variables proper in the find_trampoline_placement()
code.
- Silence GCC( build warning for the real mode part of the build"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text
x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence GCC9 build warning
x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement()
x86/apic: Include the LDR when clearing out APIC registers
x86/apic: Do not initialize LDR and DFR for bigsmp
uprobes/x86: Fix detection of 32-bit user mode
x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines
Pull perf fixes from Thomas Gleixner:
"Two fixes for perf x86 hardware implementations:
- Restrict the period on Nehalem machines to prevent perf from
hogging the CPU
- Prevent the AMD IBS driver from overwriting the hardwre controlled
and pre-seeded reserved bits (0-6) in the count register which
caused a sample bias for dispatched micro-ops"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
perf/x86/intel: Restrict period on Nehalem
- Make exported ftrace function not static
- Fix NULL pointer dereference in reading probes as they are created
- Fix NULL pointer dereference in k/uprobe clean up path
- Various documentation fixes
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXWpTNRQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qpXtAPsGoHDHkgPIyl9bnV0oZfwLrAl4qEyg
RpVp9ZMcG4UtMwEAp/SXRFzvL+EUiKyd1U3FZy2jhVec3+hX7SzIGqgONA4=
=ee8V
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Small fixes and minor cleanups for tracing:
- Make exported ftrace function not static
- Fix NULL pointer dereference in reading probes as they are created
- Fix NULL pointer dereference in k/uprobe clean up path
- Various documentation fixes"
* tag 'trace-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Correct kdoc formats
ftrace/x86: Remove mcount() declaration
tracing/probe: Fix null pointer dereference
tracing: Make exported ftrace_set_clr_event non-static
ftrace: Check for successful allocation of hash
ftrace: Check for empty hash and comment the race with registering probes
ftrace: Fix NULL pointer dereference in t_probe_next()
One significant fix for 32-bit RISC-V systems:
- Fix the RV32 memory map to prevent userspace from corrupting the
FIXMAP area. Without this patch, the system can crash very early
during the boot.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl1p21sACgkQx4+xDQu9
KktWhBAAii1ee/ef0eGQFgeGytOQ+3IhwcqPR0pVmulOXNRsweBmsuTpSt2FEai5
V+EskDg6UyJSXW6jwvVmuH76MPqaNyJFnrFJ7dAJqHqBHyH2Wq2kywe1Z8VeZPhJ
yPwScDV19gyWhD64W8sWsYfEQBoDL+psHdP/2EGfDq+41SWLVWmK6gqZlyMOaWpL
JWLcxJI7QassKezfOX9e5y1IeVrqQcLB9gWVKC1o3RfBLa5DLm9WfHg8XYhdiJ81
WbFNtMcr44wqro8Oc/ESM64ooV1T3+54uJQCf5FjlO5UPdP4xgR9og+Q1PBimEe0
3/csNxhZpxSFixwkfBHDBzq4K/33n3p8f4nBFizca39HuTjkajkFxo84S5xt0iyG
CdSzEvseLztN5S4ov/7glzujAp9VW4GhpNLBFgtn/98Lm5wTYHu4PZf8KvCpqGei
1jqVRBctYrnM/cCzSXpijce/UmjdDIWgLjOhYKOBvXwFURxfaD3k5MytYatsNB+Y
PgRPrazWPl+CS6qWQdONwyFiIfsHB/kHAEfJArE6gHnmpV4uFj0msdbg4dbhRwEw
UgmofqzO6KK3pUDBKQEmOngOv6Fs6D6e6krGqKLBi2jfi61e5L9JXr/nxokLWQNh
+LJq75gCiM8lkEVIoq6E/+dLMtKxZxXk0Jdsj18MGT/y8oaaXCw=
=vL0K
-----END PGP SIGNATURE-----
Merge tag 'riscv/for-v5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fix from Paul Walmsley:
"One significant fix for 32-bit RISC-V systems:
Fix the RV32 memory map to prevent userspace from corrupting the
FIXMAP area. Without this patch, the system can crash very early
during the boot"
* tag 'riscv/for-v5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Fix FIXMAP area corruption on RV32 systems
PPC:
- Fix bug which could leave locks locked in the host on return to a
guest.
x86:
- Prevent infinitely looping emulation of a failing syscall while single
stepping.
- Do not crash the host when nesting is disabled.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEj/xJEkTHTJzZE/FJQP/qGw8qh+gFAl1p0GMACgkQQP/qGw8q
h+hJFQf5ASPp0OdmoRHZwJel/Tb9+MbBrSdLzSVotd6V39M+MktHa8V9xfLR4aBX
ZoZudR3W0Xi8ImqkaEO/RTxB30/wb9iNF3gzb2pCZYYftcbMsuxDop0tYdDSNlHX
QQB6ptIdXVDX8rs9rGvGLpLc/OpjGXVY2ZuHGWfAN+MDLPYwDcr90XC4TkiCLQXI
1cKT7QhxBst3Zw3ZCVHCAXPRJ/Ve6V6/L9lci7ORjR6Lycmljdr41WAexzYsZYrh
z3NbTq643MydXSe6sa7Ohv9yH2I7vTMtb2ILLCawOvAJCxf7Uk2roaXSeTJ7OQvq
Z0EbLVkmjfm3ypqkRSTFtH63qw3wYw==
=G9xs
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"PPC:
- Fix bug which could leave locks held in the host on return to a
guest.
x86:
- Prevent infinitely looping emulation of a failing syscall while
single stepping.
- Do not crash the host when nesting is disabled"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Don't update RIP or do single-step on faulting emulation
KVM: x86: hyper-v: don't crash on KVM_GET_SUPPORTED_HV_CPUID when kvm_intel.nested is disabled
KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handling
Commit 562e14f722 ("ftrace/x86: Remove mcount support") removed the
support for using mcount, so we could remove the mcount() declaration
to clean up.
Link: http://lkml.kernel.org/r/20190826170150.10f101ba@xhacker.debian
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
- A fix for update_sections_early() to cope with NULL ->mm pointers.
- A correction to the backtrace code to allow proper backtraces.
- Reinforcement of pfn_valid() with PFNs >= 4GiB.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAXWljsvTnkBvkraxkAQKsqxAApF0asWjoPHZaA2rS7h0GWVV5290bluy1
LcT9MyKPNc8K0AyOAhj6m9OnBlzbYOlERklDNrV5uHEprJYjzI5Mp9t5cmwqBOne
hzD9rrFAdhkGP4A0WXeTIPREDIzxMZUgurTwt/zVdm2V5UcwOdrybj4+F7gQGxFl
A1URkGFV3ZqOpSl02t9gV5hyqtzry95qRWgL0JQAlSHnbKcDpoiUvoC7ayAKXKRR
VT9lD/KfuGFSaxi436L8p/TcPlluY+zpTzs+Eambd7Z13kkRB/NlVEBEpvNMnelu
ORZMNSmQp7mnWMfSlJoYV1wCVSIIyf+BNZ4keTTecppOTawph41keDqF6YCTT+sB
yqtKbWgP50ElRZhalHcH2K8jPrQASwZBkQJ0xtT0NecrFJA9AMegege1Jt+54RyZ
X7yfIVt+/T/cYsCJDiPQVgqS/994kz9ifShl8nmQEE9Qg61zh6hFEOkL5/eMt82s
KxSM+jyoPJYrSuuTi7jIQvWLnz+gJ4bo3FpHH+vkvWUDrImi9J4hzhvvakCxAV0U
tudCRLSM3J/gM/zIEJFvMxM1ejC08EbJZWToZ/0/MwvejVSDAgVXEB1/27TTnM6q
6pru83LjQpsKODO3pHQjrwoGByisNeZ0oyaWCL4T0FgwuZ4IXSidgWB4VXpXZnO+
MuK9qjwvYac=
=ec12
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Three fixes for ARM this time around:
- A fix for update_sections_early() to cope with NULL ->mm pointers.
- A correction to the backtrace code to allow proper backtraces.
- Reinforcement of pfn_valid() with PFNs >= 4GiB"
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8901/1: add a criteria for pfn_valid of arm
ARM: 8897/1: check stmfd instruction using right shift
ARM: 8874/1: mm: only adjust sections of valid mm structures
The majority of the fixes this time are for OMAP hardware,
here is a breakdown of the significant changes:
Various device tree bug fixes:
- TI am57xx boards need a voltage level fix to avoid damaging SD cards
- vf610-bk4 fails to detect its flash due to an incorrect description
- meson-g12a USB phy configuration fails
- meson-g12b reboot should not power off the SD card
- Some corrections for apparently harmless differences from the
documentation.
Regression fixes:
- ams-delta FIQ interrupts broke in 5.3
- TI am3/am4 mmc controllers broke in 5.2
The logic_pio driver (used on some Huawei ARM servers) needs a few
bug fixes for reliability.
A couple of compile-time warning fixes
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJdaUf1AAoJEJpsee/mABjZ+7YQALOXvCCfmYKkOSflNKEVXuiZ
tL7uj5PT2E136JoAoyEs8pqXLSpFnC/PxZ7GuN3+ZD0lqVz8PbIn5MhJ9KRrRzSF
lazjW++VQcFt1KR77l2umVi9/KiYD7UXU1HHmWN8+D/PX6EM+Gv1j65Ve8oTRn76
kfsq58y2YC6Rqv9dkiK91mteQ2bdA9b4O33V5M+Idq3aBwyNr5KKihDsNKPSvKl9
ibGmfGnukVcrVtU2reaUxNp2G1OsIKswq2bB0VwUlFMipPxML6rv94dJxDblb2Ns
nq3LeG+1TF9mbAxya2sWaF6fIBpxdEU5llFYRoIknSS+F9qM/nSVsi5WsyJJnCxk
mEvJLhhtt4gH2TZmPvZ6sPWFSVBHDnr8V3F4c0//aTRCN+tV7BCYbf8f3rv/CRNq
MLRsw8gHVPZyUUK9M4afeR3PqEx4/hbU9mpCtduAsiudnA1gtDBfQp8ODMop8aJ1
tCCdbFPoZIKKU/yhUm0OAbykPLVGb9zWGBNwYWuNs6IZFkyksGoFg1AspzKGvrcD
Knywz9dSmiDxRi3qDjVEd/9Rr/CtvUHmbaGq8RlTHmbB7WYoW84UPSD419V4j9vd
eIn4ScejKJCZUDACQsQXh7nnbg+QtnMq+3ODzvjsax2FAEKd8xtaZElSs8OA0U6b
xdqEuHPNY8VWBFXwAfdp
=mV+7
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"The majority of the fixes this time are for OMAP hardware, here is a
breakdown of the significant changes:
Various device tree bug fixes:
- TI am57xx boards need a voltage level fix to avoid damaging SD
cards
- vf610-bk4 fails to detect its flash due to an incorrect description
- meson-g12a USB phy configuration fails
- meson-g12b reboot should not power off the SD card
- Some corrections for apparently harmless differences from the
documentation.
Regression fixes:
- ams-delta FIQ interrupts broke in 5.3
- TI am3/am4 mmc controllers broke in 5.2
The logic_pio driver (used on some Huawei ARM servers) got a few bug
fixes for reliability.
And a couple of compile-time warning fixes"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (26 commits)
soc: ixp4xx: Protect IXP4xx SoC drivers by ARCH_IXP4XX || COMPILE_TEST
soc: ti: pm33xx: Make two symbols static
soc: ti: pm33xx: Fix static checker warnings
ARM: OMAP: dma: Mark expected switch fall-throughs
ARM: dts: Fix incomplete dts data for am3 and am4 mmc
bus: ti-sysc: Simplify cleanup upon failures in sysc_probe()
ARM: OMAP1: ams-delta-fiq: Fix missing irq_ack
ARM: dts: dra74x: Fix iodelay configuration for mmc3
ARM: dts: am335x: Fix UARTs length
ARM: OMAP2+: Fix omap4 errata warning on other SoCs
bus: hisi_lpc: Add .remove method to avoid driver unbind crash
bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free
lib: logic_pio: Add logic_pio_unregister_range()
lib: logic_pio: Avoid possible overlap for unregistering regions
lib: logic_pio: Fix RCU usage
arm64: dts: amlogic: odroid-n2: keep SD card regulator always on
arm64: dts: meson-g12a-sei510: enable IR controller
arm64: dts: meson-g12a: add missing dwc2 phy-names
ARM: dts: vf610-bk4: Fix qspi node description
ARM: dts: Fix incorrect dcan register mapping for am3, am4 and dra7
...
When counting dispatched micro-ops with cnt_ctl=1, in order to prevent
sample bias, IBS hardware preloads the least significant 7 bits of
current count (IbsOpCurCnt) with random values, such that, after the
interrupt is handled and counting resumes, the next sample taken
will be slightly perturbed.
The current count bitfield is in the IBS execution control h/w register,
alongside the maximum count field.
Currently, the IBS driver writes that register with the maximum count,
leaving zeroes to fill the current count field, thereby overwriting
the random bits the hardware preloaded for itself.
Fix the driver to actually retain and carry those random bits from the
read of the IBS control register, through to its write, instead of
overwriting the lower current count bits with zeroes.
Tested with:
perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0 <workload>
'perf annotate' output before:
15.70 65: addsd %xmm0,%xmm1
17.30 add $0x1,%rax
15.88 cmp %rdx,%rax
je 82
17.32 72: test $0x1,%al
jne 7c
7.52 movapd %xmm1,%xmm0
5.90 jmp 65
8.23 7c: sqrtsd %xmm1,%xmm0
12.15 jmp 65
'perf annotate' output after:
16.63 65: addsd %xmm0,%xmm1
16.82 add $0x1,%rax
16.81 cmp %rdx,%rax
je 82
16.69 72: test $0x1,%al
jne 7c
8.30 movapd %xmm1,%xmm0
8.13 jmp 65
8.24 7c: sqrtsd %xmm1,%xmm0
8.39 jmp 65
Tested on Family 15h and 17h machines.
Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
affect their operation.
It is unknown why commit db98c5faf8 ("perf/x86: Implement 64-bit
counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
field; the number of preloaded random bits has always been 7, AFAICT.
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Arnaldo Carvalho de Melo" <acme@kernel.org>
Cc: <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Borislav Petkov" <bp@alien8.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: "Namhyung Kim" <namhyung@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com
ftrace does not use text_poke() for enabling trace functionality. It uses
its own mechanism and flips the whole kernel text to RW and back to RO.
The CPA rework removed a loop based check of 4k pages which tried to
preserve a large page by checking each 4k page whether the change would
actually cover all pages in the large page.
This resulted in endless loops for nothing as in testing it turned out that
it actually never preserved anything. Of course testing missed to include
ftrace, which is the one and only case which benefitted from the 4k loop.
As a consequence enabling function tracing or ftrace based kprobes results
in a full 4k split of the kernel text, which affects iTLB performance.
The kernel RO protection is the only valid case where this can actually
preserve large pages.
All other static protections (RO data, data NX, PCI, BIOS) are truly
static. So a conflict with those protections which results in a split
should only ever happen when a change of memory next to a protected region
is attempted. But these conflicts are rightfully splitting the large page
to preserve the protected regions. In fact a change to the protected
regions itself is a bug and is warned about.
Add an exception for the static protection check for kernel text RO when
the to be changed region spawns a full large page which allows to preserve
the large mappings. This also prevents the syslog to be spammed about CPA
violations when ftrace is used.
The exception needs to be removed once ftrace switched over to text_poke()
which avoids the whole issue.
Fixes: 585948f4f6 ("x86/mm/cpa: Avoid the 4k pages check completely")
Reported-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de
Hi Linus,
Please, pull the following patches that mark switch cases where we are
expecting to fall through.
- Fix fall-through warnings on arc and nds32 for multiple
configurations.
Thanks
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAl1n+OEACgkQRwW0y0cG
2zGqJw//TmX+aoIeSe04rpzGr+MRzWHsv/NnlA6usaqD9k3ICfNwlQPv/jYdjUg3
UBE1WHmDAdrjfCq2+gxjQEsbVmMFM5tfujXyA3dMfsDsit6Y0V3XmHiIQIyc3vNF
A2XAGlymh3uTIynPsOW9tThc5fNT5UBTqRh6Mm/0Xkr3IYgHLu66pDKNLpW/4sFA
jfg3lTp0vBlh4wXbSkrkKnWon4qULJGo4uSwMiOL66zqpXIyatml/MMgJJ9USchO
AH8LYtN0ldwtlaLWmvY0qbwrnOXWu6UwYhys8P53BSdnKGb/zJ5qDF1h8pJdjt9K
3vMbt11+nGa46YeYpxI/BwR2e3F/g313JpfM0rSI+nu9jPMStX7B3o51DYe4D77o
FEvVhxnMajQz8pz0/83bI8NlTjeApSNgywyGonZ/+WeoUQUX1C5SisBPXcxxMY0f
NDm7a8ty7BjZbmDip6a7LrHj1+dEKn1y8HDpTx96z4Q4vNb0YU7NpdcUvHf5VIdl
mW3izy63MmP6YSvkdHbj07/PpIwMy3Wd18BqghxS0xpi3Cs4rik3cuaVhbyVLVlt
FC7letPkHVhR+X6QVZ71ke7Ia//imJGYgew1/iKxSX7umMYsqUL+oTspAym0xndc
/sbD7YjuaApt8K23fkWavulL08OPNwf0YPrsPD6NdnuZNPhRvi8=
=L0Bt
-----END PGP SIGNATURE-----
Merge tag 'Wimplicit-fallthrough-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull fallthrough fixes from Gustavo A. R. Silva:
"Fix fall-through warnings on arc and nds32 for multiple
configurations"
* tag 'Wimplicit-fallthrough-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
nds32: Mark expected switch fall-throughs
ARC: unwind: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
This patch fixes the following warnings (Building: allmodconfig nds32):
include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/nds32/kernel/signal.c:362:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
arch/nds32/kernel/signal.c:315:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
This patch fixes the following warnings (Building: haps_hs_defconfig arc):
arch/arc/kernel/unwind.c: In function ‘read_pointer’:
./include/linux/compiler.h:328:5: warning: this statement may fall through [-Wimplicit-fallthrough=]
do { \
^
./include/linux/compiler.h:338:2: note: in expansion of macro ‘__compiletime_assert’
__compiletime_assert(condition, msg, prefix, suffix)
^~~~~~~~~~~~~~~~~~~~
./include/linux/compiler.h:350:2: note: in expansion of macro ‘_compiletime_assert’
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
^~~~~~~~~~~~~~~~
arch/arc/kernel/unwind.c:573:3: note: in expansion of macro ‘BUILD_BUG_ON’
BUILD_BUG_ON(sizeof(u32) != sizeof(value));
^~~~~~~~~~~~
arch/arc/kernel/unwind.c:575:2: note: here
case DW_EH_PE_native:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
pfn_valid can be wrong when parsing a invalid pfn whose phys address
exceeds BITS_PER_LONG as the MSB will be trimed when shifted.
The issue originally arise from bellowing call stack, which corresponding to
an access of the /proc/kpageflags from userspace with a invalid pfn parameter
and leads to kernel panic.
[46886.723249] c7 [<c031ff98>] (stable_page_flags) from [<c03203f8>]
[46886.723264] c7 [<c0320368>] (kpageflags_read) from [<c0312030>]
[46886.723280] c7 [<c0311fb0>] (proc_reg_read) from [<c02a6e6c>]
[46886.723290] c7 [<c02a6e24>] (__vfs_read) from [<c02a7018>]
[46886.723301] c7 [<c02a6f74>] (vfs_read) from [<c02a778c>]
[46886.723315] c7 [<c02a770c>] (SyS_pread64) from [<c0108620>]
(ret_fast_syscall+0x0/0x28)
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Currently, various virtual memory areas of Linux RISC-V are organized
in increasing order of their virtual addresses is as follows:
1. User space area (This is lowest area and starts at 0x0)
2. FIXMAP area
3. VMALLOC area
4. Kernel area (This is highest area and starts at PAGE_OFFSET)
The maximum size of user space aread is represented by TASK_SIZE.
On RV32 systems, TASK_SIZE is defined as VMALLOC_START which causes the
user space area to overlap the FIXMAP area. This allows user space apps
to potentially corrupt the FIXMAP area and kernel OF APIs will crash
whenever they access corrupted FDT in the FIXMAP area.
On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas
happen to overlap so we don't see any FIXMAP area corruptions.
This patch fixes FIXMAP area corruption on RV32 systems by setting
TASK_SIZE to FIXADDR_START. We also move FIXADDR_TOP, FIXADDR_SIZE,
and FIXADDR_START defines to asm/pgtable.h so that we can avoid cyclic
header includes.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Tested-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
One of the very few warnings I have in the current build comes from
arch/x86/boot/edd.c, where I get the following with a gcc9 build:
arch/x86/boot/edd.c: In function ‘query_edd’:
arch/x86/boot/edd.c:148:11: warning: taking address of packed member of ‘struct boot_params’ may result in an unaligned pointer value [-Waddress-of-packed-member]
148 | mbrptr = boot_params.edd_mbr_sig_buffer;
| ^~~~~~~~~~~
This warning triggers because we throw away all the CFLAGS and then make
a new set for REALMODE_CFLAGS, so the -Wno-address-of-packed-member we
added in the following commit is not present:
6f303d6053 ("gcc-9: silence 'address-of-packed-member' warning")
The simplest solution for now is to adjust the warning for this version
of CFLAGS as well, but it would definitely make sense to examine whether
REALMODE_CFLAGS could be derived from CFLAGS, so that it picks up changes
in the compiler flags environment automatically.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Don't advance RIP or inject a single-step #DB if emulation signals a
fault. This logic applies to all state updates that are conditional on
clean retirement of the emulation instruction, e.g. updating RFLAGS was
previously handled by commit 38827dbd3f ("KVM: x86: Do not update
EFLAGS on faulting emulation").
Not advancing RIP is likely a nop, i.e. ctxt->eip isn't updated with
ctxt->_eip until emulation "retires" anyways. Skipping #DB injection
fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to
invalid state with EFLAGS.TF=1 would loop indefinitely due to emulation
overwriting the #UD with #DB and thus restarting the bad SYSCALL over
and over.
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@kernel.org>
Fixes: 663f4c61b8 ("KVM: x86: handle singlestep during emulation")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
If kvm_intel is loaded with nested=0 parameter an attempt to perform
KVM_GET_SUPPORTED_HV_CPUID results in OOPS as nested_get_evmcs_version hook
in kvm_x86_ops is NULL (we assign it in nested_vmx_hardware_setup() and
this only happens in case nested is enabled).
Check that kvm_x86_ops->nested_get_evmcs_version is not NULL before
calling it. With this, we can remove the stub from svm as it is no
longer needed.
Cc: <stable@vger.kernel.org>
Fixes: e2e871ab2f ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
- Support for Edge Triggered IRQs in ARC IDU intc
- other fixes here and there
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJdZWFxAAoJEGnX8d3iisJe1/QP/1QlB6bDp36ONuc0wgtvyZhS
/KDfgwyLK89WiH/lc2AgPL6BkFaOBSqpNe9PS8IdjRscGMJFaXnfifKBl2eX/sM3
4nkiIjAb9Fl4dLdaPs/51p+wvHlkdD9pzI5SYJl2IeNCZRNjjixBlaF8fezONtlu
2yuzmikeggcT7NZGnZ5IQGj6CWRm7Drb5J4mfmZu3HJ+BJOnXZpdza3q3WduT3DC
6tUA/xtUXq8sGpylXL2MgA34SbgjBDmxW8Kv32sQp6mipGJwq4jF4+n8rxF/znCe
6ILiqOwp7CjEHmpYTn2cxMC5FTP0BuvnLh/ECEFKUWgIH4/A3zy/RJOKhbZ0P0rV
+vraRvdjOA2/0P6Y1A+cGGYP2c3HwmSgHmtXwd/QRfesX2/Y7jhMlEOXZ9H2K6CC
zTqobUWQ4tFprz1P0H6p1h7Z/tJv/q4TNMZR5tcQyjwT6i7Sw+ReffTnwpPMr92V
GAZu6sahsJCOqRqk0MfaZVa54r+UlE8bbapGZo+7fZ9+UVrxLKgWwfnYbe/6eSHX
osddo3zoLuBrgq2gt/ZMseeQRdRYeH8p/3jgnEws2G/uen7GjAw9m0c3Yrs+ibVS
oNp3DNk8wkzgrLgC7xXhBkwyok85SEoCfZoQg96DXo365G0YyHZyHCI2HzIAP4oy
wtRcqnsQgEtvV1s7RiTU
=CJKr
-----END PGP SIGNATURE-----
Merge tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
- support for Edge Triggered IRQs in ARC IDU intc
- other fixes here and there
* tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
arc: prefer __section from compiler_attributes.h
dt-bindings: IDU-intc: Add support for edge-triggered interrupts
dt-bindings: IDU-intc: Clean up documentation
ARCv2: IDU-intc: Add support for edge-triggered interrupts
ARC: unwind: Mark expected switch fall-throughs
ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations
ARC: fix typo in setup_dma_ops log message
ARCv2: entry: early return from exception need not clear U & DE bits
Pull networking fixes from David Miller:
1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya
Leoshkevich.
2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for
SMC from Jason Baron.
3) ipv6_mc_may_pull() should return 0 for malformed packets, not
-EINVAL. From Stefano Brivio.
4) Don't forget to unpin umem xdp pages in error path of
xdp_umem_reg(). From Ivan Khoronzhuk.
5) Fix sta object leak in mac80211, from Johannes Berg.
6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2
switches. From Florian Fainelli.
7) Revert DMA sync removal from r8169 which was causing regressions on
some MIPS Loongson platforms. From Heiner Kallweit.
8) Use after free in flow dissector, from Jakub Sitnicki.
9) Fix NULL derefs of net devices during ICMP processing across
collect_md tunnels, from Hangbin Liu.
10) proto_register() memory leaks, from Zhang Lin.
11) Set NLM_F_MULTI flag in multipart netlink messages consistently,
from John Fastabend.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
r8152: Set memory to all 0xFFs on failed reg reads
openvswitch: Fix conntrack cache with timeout
ipv4: mpls: fix mpls_xmit for iptunnel
nexthop: Fix nexthop_num_path for blackhole nexthops
net: rds: add service level support in rds-info
net: route dump netlink NLM_F_MULTI flag missing
s390/qeth: reject oversized SNMP requests
sock: fix potential memory leak in proto_register()
MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT
xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
ipv4/icmp: fix rt dst dev null pointer dereference
openvswitch: Fix log message in ovs conntrack
bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
bpf: fix use after free in prog symbol exposure
bpf: fix precision tracking in presence of bpf2bpf calls
flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
Revert "r8169: remove not needed call to dma_sync_single_for_device"
ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
net/ncsi: Fix the payload copying for the request coming from Netlink
qed: Add cleanup in qed_slowpath_start()
...
Gustavo noticed that 'new' can be left uninitialized if 'bios_start'
happens to be less or equal to 'entry->addr + entry->size'.
Initialize the variable at the begin of the iteration to the current value
of 'bios_start'.
Fixes: 0a46fff2f9 ("x86/boot/compressed/64: Fix boot on machines with broken E820 table")
Reported-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190826133326.7cxb4vbmiawffv2r@box
H_PUT_TCE_INDIRECT handlers receive a page with up to 512 TCEs from
a guest. Although we verify correctness of TCEs before we do anything
with the existing tables, there is a small window when a check in
kvmppc_tce_validate might pass and right after that the guest alters
the page of TCEs, causing an early exit from the handler and leaving
srcu_read_lock(&vcpu->kvm->srcu) (virtual mode) or lock_rmap(rmap)
(real mode) locked.
This fixes the bug by jumping to the common exit code with an appropriate
unlock.
Cc: stable@vger.kernel.org # v4.11+
Fixes: 121f80ba68 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>