Both the callers in the low level ASM code and __context_tracking_exit()
which is invoked from enter_from_user_mode() via user_exit_irqoff() are
marked NOKPROBE. Allowing enter_from_user_mode() to be probed is
inconsistent at best.
Aside of that while function tracing per se is safe the function trace
entry/exit points can be used via BPF as well which is not safe to use
before context tracking has reached CONTEXT_KERNEL and adjusted RCU.
Mark it noinstr which moves it into the instrumentation protected text
section and includes notrace.
Note, this needs further fixups in context tracking to ensure that the
full call chain is protected. Will be addressed in follow up changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.429059405@linutronix.de
All ASM code which is not part of the entry functionality can move out into
the .text section. No reason to keep it in the non-instrumentable entry
section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.320164650@linutronix.de
All ASM code which is not part of the entry functionality can move out into
the .text section. No reason to keep it in the non-instrumentable entry
section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.227579223@linutronix.de
The sanitizers are not really applicable to the fragile low level entry
code. Entry code needs to carefully setup a normal 'runtime' environment.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505134059.970057117@linutronix.de
No users left since two years due to commit 21d375b6b3 ("x86/entry/64:
Remove the SYSCALL64 fast path")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134059.061301403@linutronix.de
GAS cannot optimize out the test and conditional jump when context tracking
is disabled and CALL_enter_from_user_mode is an empty macro.
Wrap it in #ifdeffery. Will go away once all this is moved to C.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134058.955968069@linutronix.de
The label .Lcommon_\sym was introduced by 39e9543344.
(x86-64: Reduce amount of redundant code generated for invalidate_interruptNN)
And all the other relevant information was removed by 52aec3308d
(x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR)
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200419144049.1906-4-laijs@linux.alibaba.com
- Move the arch-specific code into arch/arm64/kvm
- Start the post-32bit cleanup
- Cherry-pick a few non-invasive pre-NV patches
x86:
- Rework of TLB flushing
- Rework of event injection, especially with respect to nested virtualization
- Nested AMD event injection facelift, building on the rework of generic code
and fixing a lot of corner cases
- Nested AMD live migration support
- Optimization for TSC deadline MSR writes and IPIs
- Various cleanups
- Asynchronous page fault cleanups (from tglx, common topic branch with tip tree)
- Interrupt-based delivery of asynchronous "page ready" events (host side)
- Hyper-V MSRs and hypercalls for guest debugging
- VMX preemption timer fixes
s390:
- Cleanups
Generic:
- switch vCPU thread wakeup from swait to rcuwait
The other architectures, and the guest side of the asynchronous page fault
work, will come next week.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7VJcYUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPf6QgAq4wU5wdd1lTGz/i3DIhNVJNJgJlp
ozLzRdMaJbdbn5RpAK6PEBd9+pt3+UlojpFB3gpJh2Nazv2OzV4yLQgXXXyyMEx1
5Hg7b4UCJYDrbkCiegNRv7f/4FWDkQ9dx++RZITIbxeskBBCEI+I7GnmZhGWzuC4
7kj4ytuKAySF2OEJu0VQF6u0CvrNYfYbQIRKBXjtOwuRK4Q6L63FGMJpYo159MBQ
asg3B1jB5TcuGZ9zrjL5LkuzaP4qZZHIRs+4kZsH9I6MODHGUxKonrkablfKxyKy
CFK+iaHCuEXXty5K0VmWM3nrTfvpEjVjbMc7e1QGBQ5oXsDM0pqn84syRg==
=v7Wn
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Move the arch-specific code into arch/arm64/kvm
- Start the post-32bit cleanup
- Cherry-pick a few non-invasive pre-NV patches
x86:
- Rework of TLB flushing
- Rework of event injection, especially with respect to nested
virtualization
- Nested AMD event injection facelift, building on the rework of
generic code and fixing a lot of corner cases
- Nested AMD live migration support
- Optimization for TSC deadline MSR writes and IPIs
- Various cleanups
- Asynchronous page fault cleanups (from tglx, common topic branch
with tip tree)
- Interrupt-based delivery of asynchronous "page ready" events (host
side)
- Hyper-V MSRs and hypercalls for guest debugging
- VMX preemption timer fixes
s390:
- Cleanups
Generic:
- switch vCPU thread wakeup from swait to rcuwait
The other architectures, and the guest side of the asynchronous page
fault work, will come next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (256 commits)
KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test
KVM: check userspace_addr for all memslots
KVM: selftests: update hyperv_cpuid with SynDBG tests
x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls
x86/kvm/hyper-v: enable hypercalls regardless of hypercall page
x86/kvm/hyper-v: Add support for synthetic debugger interface
x86/hyper-v: Add synthetic debugger definitions
KVM: selftests: VMX preemption timer migration test
KVM: nVMX: Fix VMX preemption timer migration
x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit
KVM: x86/pmu: Support full width counting
KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in
KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT
KVM: x86: acknowledgment mechanism for async pf page ready notifications
KVM: x86: interrupt based APF 'page ready' event delivery
KVM: introduce kvm_read_guest_offset_cached()
KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present()
KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info
Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously"
KVM: VMX: Replace zero-length array with flexible-array
...
Pull vfs updates from Al Viro:
"Assorted patches from Miklos.
An interesting part here is /proc/mounts stuff..."
The "/proc/mounts stuff" is using a cursor for keeeping the location
data while traversing the mount listing.
Also probably worth noting is the addition of faccessat2(), which takes
an additional set of flags to specify how the lookup is done
(AT_EACCESS, AT_SYMLINK_NOFOLLOW, AT_EMPTY_PATH).
* 'from-miklos' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: add faccessat2 syscall
vfs: don't parse "silent" option
vfs: don't parse "posixacl" option
vfs: don't parse forbidden flags
statx: add mount_root
statx: add mount ID
statx: don't clear STATX_ATIME on SB_RDONLY
uapi: deprecate STATX_ALL
utimensat: AT_EMPTY_PATH support
vfs: split out access_override_creds()
proc/mounts: add cursor
aio: fix async fsync creds
vfs: allow unprivileged whiteout creation
- Branch Target Identification (BTI)
* Support for ARMv8.5-BTI in both user- and kernel-space. This
allows branch targets to limit the types of branch from which
they can be called and additionally prevents branching to
arbitrary code, although kernel support requires a very recent
toolchain.
* Function annotation via SYM_FUNC_START() so that assembly
functions are wrapped with the relevant "landing pad"
instructions.
* BPF and vDSO updates to use the new instructions.
* Addition of a new HWCAP and exposure of BTI capability to
userspace via ID register emulation, along with ELF loader
support for the BTI feature in .note.gnu.property.
* Non-critical fixes to CFI unwind annotations in the sigreturn
trampoline.
- Shadow Call Stack (SCS)
* Support for Clang's Shadow Call Stack feature, which reserves
platform register x18 to point at a separate stack for each
task that holds only return addresses. This protects function
return control flow from buffer overruns on the main stack.
* Save/restore of x18 across problematic boundaries (user-mode,
hypervisor, EFI, suspend, etc).
* Core support for SCS, should other architectures want to use it
too.
* SCS overflow checking on context-switch as part of the existing
stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
- CPU feature detection
* Removed numerous "SANITY CHECK" errors when running on a system
with mismatched AArch32 support at EL1. This is primarily a
concern for KVM, which disabled support for 32-bit guests on
such a system.
* Addition of new ID registers and fields as the architecture has
been extended.
- Perf and PMU drivers
* Minor fixes and cleanups to system PMU drivers.
- Hardware errata
* Unify KVM workarounds for VHE and nVHE configurations.
* Sort vendor errata entries in Kconfig.
- Secure Monitor Call Calling Convention (SMCCC)
* Update to the latest specification from Arm (v1.2).
* Allow PSCI code to query the SMCCC version.
- Software Delegated Exception Interface (SDEI)
* Unexport a bunch of unused symbols.
* Minor fixes to handling of firmware data.
- Pointer authentication
* Add support for dumping the kernel PAC mask in vmcoreinfo so
that the stack can be unwound by tools such as kdump.
* Simplification of key initialisation during CPU bringup.
- BPF backend
* Improve immediate generation for logical and add/sub
instructions.
- vDSO
- Minor fixes to the linker flags for consistency with other
architectures and support for LLVM's unwinder.
- Clean up logic to initialise and map the vDSO into userspace.
- ACPI
- Work around for an ambiguity in the IORT specification relating
to the "num_ids" field.
- Support _DMA method for all named components rather than only
PCIe root complexes.
- Minor other IORT-related fixes.
- Miscellaneous
* Initialise debug traps early for KGDB and fix KDB cacheflushing
deadlock.
* Minor tweaks to early boot state (documentation update, set
TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
* Refactoring and cleanup
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl7U9csQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNLBHCACs/YU4SM7Om5f+7QnxIKao5DBr2CnGGvdC
yTfDghFDTLQVv3MufLlfno3yBe5G8sQpcZfcc+hewfcGoMzVZXu8s7LzH6VSn9T9
jmT3KjDMrg0RjSHzyumJp2McyelTk0a4FiKArSIIKsJSXUyb1uPSgm7SvKVDwEwU
JGDzL9IGilmq59GiXfDzGhTZgmC37QdwRoRxDuqtqWQe5CHoRXYexg87HwBKOQxx
HgU9L7ehri4MRZfpyjaDrr6quJo3TVnAAKXNBh3mZAskVS9ZrfKpEH0kYWYuqybv
znKyHRecl/rrGePV8RTMtrwnSdU26zMXE/omsVVauDfG9hqzqm+Q
=w3qi
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"A sizeable pile of arm64 updates for 5.8.
Summary below, but the big two features are support for Branch Target
Identification and Clang's Shadow Call stack. The latter is currently
arm64-only, but the high-level parts are all in core code so it could
easily be adopted by other architectures pending toolchain support
Branch Target Identification (BTI):
- Support for ARMv8.5-BTI in both user- and kernel-space. This allows
branch targets to limit the types of branch from which they can be
called and additionally prevents branching to arbitrary code,
although kernel support requires a very recent toolchain.
- Function annotation via SYM_FUNC_START() so that assembly functions
are wrapped with the relevant "landing pad" instructions.
- BPF and vDSO updates to use the new instructions.
- Addition of a new HWCAP and exposure of BTI capability to userspace
via ID register emulation, along with ELF loader support for the
BTI feature in .note.gnu.property.
- Non-critical fixes to CFI unwind annotations in the sigreturn
trampoline.
Shadow Call Stack (SCS):
- Support for Clang's Shadow Call Stack feature, which reserves
platform register x18 to point at a separate stack for each task
that holds only return addresses. This protects function return
control flow from buffer overruns on the main stack.
- Save/restore of x18 across problematic boundaries (user-mode,
hypervisor, EFI, suspend, etc).
- Core support for SCS, should other architectures want to use it
too.
- SCS overflow checking on context-switch as part of the existing
stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
CPU feature detection:
- Removed numerous "SANITY CHECK" errors when running on a system
with mismatched AArch32 support at EL1. This is primarily a concern
for KVM, which disabled support for 32-bit guests on such a system.
- Addition of new ID registers and fields as the architecture has
been extended.
Perf and PMU drivers:
- Minor fixes and cleanups to system PMU drivers.
Hardware errata:
- Unify KVM workarounds for VHE and nVHE configurations.
- Sort vendor errata entries in Kconfig.
Secure Monitor Call Calling Convention (SMCCC):
- Update to the latest specification from Arm (v1.2).
- Allow PSCI code to query the SMCCC version.
Software Delegated Exception Interface (SDEI):
- Unexport a bunch of unused symbols.
- Minor fixes to handling of firmware data.
Pointer authentication:
- Add support for dumping the kernel PAC mask in vmcoreinfo so that
the stack can be unwound by tools such as kdump.
- Simplification of key initialisation during CPU bringup.
BPF backend:
- Improve immediate generation for logical and add/sub instructions.
vDSO:
- Minor fixes to the linker flags for consistency with other
architectures and support for LLVM's unwinder.
- Clean up logic to initialise and map the vDSO into userspace.
ACPI:
- Work around for an ambiguity in the IORT specification relating to
the "num_ids" field.
- Support _DMA method for all named components rather than only PCIe
root complexes.
- Minor other IORT-related fixes.
Miscellaneous:
- Initialise debug traps early for KGDB and fix KDB cacheflushing
deadlock.
- Minor tweaks to early boot state (documentation update, set
TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
- Refactoring and cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
KVM: arm64: Check advertised Stage-2 page size capability
arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
ACPI/IORT: Remove the unused __get_pci_rid()
arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
arm64/cpufeature: Introduce ID_MMFR5 CPU register
arm64/cpufeature: Introduce ID_DFR1 CPU register
arm64/cpufeature: Introduce ID_PFR2 CPU register
arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
arm64: mm: Add asid_gen_match() helper
firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
arm64: vdso: Fix CFI directives in sigreturn trampoline
arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
...
functionality intended.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl7VNX4RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1g43A/+K+TCmm8+G0DSL5JSHiI93J9yu9ac3yEU
4V9eOxcrQVEPqZUEgGNl8yucMXsTj+trT1J0ZKygoVYzpzFSsJzeyQ97CfNa25x4
AIKrVewkSBtLS4Fof1jfSgapWlY54OldMWfLNXInMPxekD0gCRhIp2hmidxwZouX
fyMsZGw9YjEPNfzHDjfADymRLOVJHG3rpd8hjrbNLblMR+xaleLHezFwn7+6PgXl
FaENy3MVubziTOWr5AT39xG3zKide1boeDI/eszD1pFu4DeBc5/7u8tYglhqGj/i
qCoojXUJxxEK/NRFO0zSXKG9vb1ZLKERRFmPbD4xbfgPPKMQRFGf2JcSfF6HuK/o
reay1MWMIapD2E3TSoJAcLaKIk/Z8nEzVXhff3bmU5Zskbhprgqz/8LblyfNdJZ3
SlnnQxpfnc+Up36EU6yk42Dy2x9IW7Ew04rWVuWzF7VixbVlKfK8MpjNSAhyduuO
6rs0YnIW2PIt7cjskrT5HEAvUVFzd2EaY327+L9fb56Mrb3fzg1T2ihVnzAs9r2s
GoYuPL9uFnHZS19MclRq8In7dFviypeL9IX9FcBCaGuqGlWdSIahLW8OyT9tOqIw
Wn7bpSHz8GM9OZIBs3u6PDE7qwPQkRTFoJzt8H5PtcUQbZSwOOOV4CjvCG4xQ14c
j+xSwhXUpxg=
=Qrqk
-----END PGP SIGNATURE-----
Merge tag 'x86-vdso-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Ingo Molnar:
"Clean up various aspects of the vDSO code, no change in functionality
intended"
* tag 'x86-vdso-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso/Makefile: Add vobjs32
x86/vdso/vdso2c: Convert iterators to unsigned
x86/vdso/vdso2c: Correct error messages on file open
KVM overloads #PF to indicate two types of not-actually-page-fault
events. Right now, the KVM guest code intercepts them by modifying
the IDT and hooking the #PF vector. This makes the already fragile
fault code even harder to understand, and it also pollutes call
traces with async_page_fault and do_async_page_fault for normal page
faults.
Clean it up by moving the logic into do_page_fault() using a static
branch. This gets rid of the platform trap_init override mechanism
completely.
[ tglx: Fixed up 32bit, removed error code from the async functions and
massaged coding style ]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134059.169270470@linutronix.de
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl7BzV8eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGg8EH/A2pXMTxtc96RI4S
sttEsUQqbakFS0Z/2tQPpMGr/qW2e5eHgsTX/a3SiUeZiIXk6f4lMFkMuctzBf7p
X77cNEDwGOEdbtCXTsMcmKSde7sP2zCXsPB8xTWLyE6rnaFRgikwwkeqgkIKhp1h
bvOQV0t9HNGvxGAM0iZeOvQAvFl4vd7nS123/MYbir9cugfQUSJRueQ4BiCiJqVE
6cNA7/vFzDJuFGszzIrJ7HXn/IdQMMWHkvTDjgBw0GZw1mDbGFbfbZwOeTz1ojCt
smUQ4tIFxBa/VA5zx7dOy2P2keHbSVf4VLkZRPcceT7OqVS65ETmFDp+qt5NdWM5
vZ8+7/0=
=CyYH
-----END PGP SIGNATURE-----
Merge tag 'v5.7-rc6' into objtool/core, to pick up fixes and resolve semantic conflict
Resolve structural conflict between:
59566b0b62: ("x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up")
which introduced a new reference to 'ftrace_epilogue', and:
0298739b79: ("x86,ftrace: Fix ftrace_regs_caller() unwind")
Which renamed it to 'ftrace_caller_end'. Rename the new usage site in the merge commit.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
POSIX defines faccessat() as having a fourth "flags" argument, while the
linux syscall doesn't have it. Glibc tries to emulate AT_EACCESS and
AT_SYMLINK_NOFOLLOW, but AT_EACCESS emulation is broken.
Add a new faccessat(2) syscall with the added flags argument and implement
both flags.
The value of AT_EACCESS is defined in glibc headers to be the same as
AT_REMOVEDIR. Use this value for the kernel interface as well, together
with the explanatory comment.
Also add AT_EMPTY_PATH support, which is not documented by POSIX, but can
be useful and is trivial to implement.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
In order to change the {JMP,CALL}_NOSPEC macros to call out-of-line
versions of the retpoline magic, we need to remove the '%' from the
argument, such that we can paste it onto symbol names.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20200428191700.151623523@infradead.org
The following execution path is possible:
fsnotify()
[ realign the stack and store previous SP in R10 ]
<IRQ>
[ only IRET regs saved ]
common_interrupt()
interrupt_entry()
<NMI>
[ full pt_regs saved ]
...
[ unwind stack ]
When the unwinder goes through the NMI and the IRQ on the stack, and
then sees fsnotify(), it doesn't have access to the value of R10,
because it only has the five IRET registers. So the unwind stops
prematurely.
However, because the interrupt_entry() code is careful not to clobber
R10 before saving the full regs, the unwinder should be able to read R10
from the previously saved full pt_regs associated with the NMI.
Handle this case properly. When encountering an IRET regs frame
immediately after a full pt_regs frame, use the pt_regs as a backup
which can be used to get the C register values.
Also, note that a call frame resets the 'prev_regs' value, because a
function is free to clobber the registers. For this fix to work, the
IRET and full regs frames must be adjacent, with no FUNC frames in
between. So replace the FUNC hint in interrupt_entry() with an
IRET_REGS hint.
Fixes: ee9f8fce99 ("x86/unwind: Add the ORC unwinder")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/97a408167cc09f1cfa0de31a7b70dd88868d743f.1587808742.git.jpoimboe@redhat.com
The LEAQ instruction in rewind_stack_do_exit() moves the stack pointer
directly below the pt_regs at the top of the task stack before calling
do_exit(). Tell the unwinder to expect pt_regs.
Fixes: 8c1f75587a ("x86/entry/64: Add unwind hint annotations")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/68c33e17ae5963854916a46f522624f8e1d264f2.1587808742.git.jpoimboe@redhat.com
UNWIND_HINT_FUNC has some limitations: specifically, it doesn't reset
all the registers to undefined. This causes objtool to get confused
about the RBP push in __switch_to_asm(), resulting in bad ORC data.
While __switch_to_asm() does do some stack magic, it's otherwise a
normal callable-from-C function, so just annotate it as a function,
which makes objtool happy and allows it to produces the correct hints
automatically.
Fixes: 8c1f75587a ("x86/entry/64: Add unwind hint annotations")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/03d0411920d10f7418f2e909210d8e9a3b2ab081.1587808742.git.jpoimboe@redhat.com
In swapgs_restore_regs_and_return_to_usermode, after the stack is
switched to the trampoline stack, the existing UNWIND_HINT_REGS hint is
no longer valid, which can result in the following ORC unwinder warning:
WARNING: can't dereference registers at 000000003aeb0cdd for ip swapgs_restore_regs_and_return_to_usermode+0x93/0xa0
For full correctness, we could try to add complicated unwind hints so
the unwinder could continue to find the registers, but when when it's
this close to kernel exit, unwind hints aren't really needed anymore and
it's fine to just use an empty hint which tells the unwinder to stop.
For consistency, also move the UNWIND_HINT_EMPTY in
entry_SYSCALL_64_after_hwframe to a similar location.
Fixes: 3e3b9293d3 ("x86/entry/64: Return to userspace from the trampoline stack")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Dave Jones <dsj@fb.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Joe Mario <jmario@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/60ea8f562987ed2d9ace2977502fe481c0d7c9a0.1587808742.git.jpoimboe@redhat.com
The PUSH_AND_CLEAR_REGS macro zeroes each register immediately after
pushing it. If an NMI or exception hits after a register is cleared,
but before the UNWIND_HINT_REGS annotation, the ORC unwinder will
wrongly think the previous value of the register was zero. This can
confuse the unwinding process and cause it to exit early.
Because ORC is simpler than DWARF, there are a limited number of unwind
annotation states, so it's not possible to add an individual unwind hint
after each push/clear combination. Instead, the register clearing
instructions need to be consolidated and moved to after the
UNWIND_HINT_REGS annotation.
Fixes: 3f01daecd5 ("x86/entry/64: Introduce the PUSH_AND_CLEAN_REGS macro")
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Jones <dsj@fb.com>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lore.kernel.org/r/68fd3d0bc92ae2d62ff7879d15d3684217d51f08.1587808742.git.jpoimboe@redhat.com
Treat ia32/i386 objects in array the same as 64-bit vdso objects.
Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200420183256.660371-5-dima@arista.com
`i` and `j` are used everywhere with unsigned types.
Convert `i` to unsigned long in order to avoid signed to unsigned
comparisons. Convert `k` to unsigned int with the same purpose.
Also, drop `j` as `i` could be used in place of it.
Introduce syms_nr for readability.
Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200420183256.660371-4-dima@arista.com
err() message in main() is misleading: it should print `outfilename`,
which is argv[3], not argv[2].
Correct error messages to be more precise about what failed and for
which file.
Co-developed-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200420183256.660371-2-dima@arista.com
Make the doublefault exception handler unconditional on 32-bit. Yes,
it is important to be able to catch #DF exceptions instead of silent
reboots. Yes, the code size increase is worth every byte. And one less
CONFIG symbol is just the cherry on top.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200404083646.8897-1-bp@alien8.de
The code, #undef CONFIG_OPTIMIZE_INLINING, is not working as expected
because <linux/compiler_types.h> is parsed before vclock_gettime.c since
28128c61e0 ("kconfig.h: Include compiler types to avoid missed struct
attributes").
Since then, <linux/compiler_types.h> is included really early by using the
'-include' option. So, you cannot negate the decision of
<linux/compiler_types.h> in this way.
You can confirm it by checking the pre-processed code, like this:
$ make arch/x86/entry/vdso/vdso32/vclock_gettime.i
There is no difference with/without CONFIG_CC_OPTIMIZE_FOR_SIZE.
It is about two years since 28128c61e0. Nobody has reported a problem
(or, nobody has even noticed the fact that this code is not working).
It is ugly and unreliable to attempt to undefine a CONFIG option from C
files, and anyway the inlining heuristic is up to the compiler.
Just remove the broken code.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Miller <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20200220110807.32534-1-masahiroy@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here are 3 SPDX patches for 5.7-rc1.
One fixes up the SPDX tag for a single driver, while the other two go
through the tree and add SPDX tags for all of the .gitignore files as
needed.
Nothing too complex, but you will get a merge conflict with your current
tree, that should be trivial to handle (one file modified by two things,
one file deleted.)
All 3 of these have been in linux-next for a while, with no reported
issues other than the merge conflict.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXodg5A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykySQCgy9YDrkz7nWq6v3Gohl6+lW/L+rMAnRM4uTZm
m5AuCzO3Azt9KBi7NL+L
=2Lm5
-----END PGP SIGNATURE-----
Merge tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here are three SPDX patches for 5.7-rc1.
One fixes up the SPDX tag for a single driver, while the other two go
through the tree and add SPDX tags for all of the .gitignore files as
needed.
Nothing too complex, but you will get a merge conflict with your
current tree, that should be trivial to handle (one file modified by
two things, one file deleted.)
All three of these have been in linux-next for a while, with no
reported issues other than the merge conflict"
* tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
ASoC: MT6660: make spdxcheck.py happy
.gitignore: add SPDX License Identifier
.gitignore: remove too obvious comments
Pull x86 cleanups from Ingo Molnar:
"This topic tree contains more commits than usual:
- most of it are uaccess cleanups/reorganization by Al
- there's a bunch of prototype declaration (--Wmissing-prototypes)
cleanups
- misc other cleanups all around the map"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
x86/mm/set_memory: Fix -Wmissing-prototypes warnings
x86/efi: Add a prototype for efi_arch_mem_reserve()
x86/mm: Mark setup_emu2phys_nid() static
x86/jump_label: Move 'inline' keyword placement
x86/platform/uv: Add a missing prototype for uv_bau_message_interrupt()
kill uaccess_try()
x86: unsafe_put-style macro for sigmask
x86: x32_setup_rt_frame(): consolidate uaccess areas
x86: __setup_rt_frame(): consolidate uaccess areas
x86: __setup_frame(): consolidate uaccess areas
x86: setup_sigcontext(): list user_access_{begin,end}() into callers
x86: get rid of put_user_try in __setup_rt_frame() (both 32bit and 64bit)
x86: ia32_setup_rt_frame(): consolidate uaccess areas
x86: ia32_setup_frame(): consolidate uaccess areas
x86: ia32_setup_sigcontext(): lift user_access_{begin,end}() into the callers
x86/alternatives: Mark text_poke_loc_init() static
x86/cpu: Fix a -Wmissing-prototypes warning for init_ia32_feat_ctl()
x86/mm: Drop pud_mknotpresent()
x86: Replace setup_irq() by request_irq()
x86/configs: Slightly reduce defconfigs
...
Pull x86 build updates from Ingo Molnar:
"A handful of updates: two linker script cleanups and a stock
defconfig+allmodconfig bootability fix"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Discard .note.gnu.property sections in vDSO
x86, vmlinux.lds: Add RUNTIME_DISCARD_EXIT to generic DISCARDS
x86/Kconfig: Make CMDLINE_OVERRIDE depend on non-empty CMDLINE
- Convert the 32bit syscalls to be pt_regs based which removes the
requirement to push all 6 potential arguments onto the stack and
consolidates the interface with the 64bit variant
- The first small portion of the exception and syscall related entry
code consolidation which aims to address the recently discovered
issues vs. RCU, int3, NMI and some other exceptions which can
interrupt any context. The bulk of the changes is still work in
progress and aimed for 5.8.
- A few lockdep namespace cleanups which have been applied into this
branch to keep the prerequisites for the ongoing work confined.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6B/TMTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYA6EAC7r/bCMxBelljT3b7LkBbiJcocJ+zK
OSzWU9miJGTAvYqn4/ciLKg4dA424b/1rBFlF1hBTCQ0HL5Cv4lajxdKEZCO5WCC
WWTCz+MC60aWFaH3VNoywiLGb39H2IbqWbS9yNPd/wBkLHiMAD6NPQntOvcPaD4j
1lyrMtLzfrWlrHxvxdI3kt5ZpFLYNXr2xk61xQjTz0ROFQBhf2sDsuhHhiYVLPj7
JwYktpbBiPeaw2+I18NPymNPY+VfY8LCTgLl5M+rbKyCqebKaedZQJ7QXFhAEqKC
Y2f+gJsKWtTDzGP2mk/5kF0uP7cd0vJK35ZCXtLZ9BbcNtFZU6w+ADqRo4pJBHRY
QRzo/AWrdkuTJF0CrP6mcneNC7NwWLSdKrE1z77RQCHUPVvhHhRDZsgdLcZ/KKwx
y1ji22trwNB+7LmI2fUOU5RRHZBIuNvQT+mPt24febJuHpZKul62dd3cqTGeSTC+
MYVknYDSg/+jk+83DhuZnTyb9lWTbq/0Q1HRDu6l2LrMIH7YMPpY5Ea64ZFYzWXy
s0+iHEM4mUzltwNauHIntjbwXi3C0l2k1WQyG0gun2eS6SXfu0lb93V4msFj/N1+
oHavH2n2A4XrRr+Ob87fsl7nfXJibWP7R9xPblrWP2sNdqfjSyGd49rnsvpWqWMK
Fj0d7tQ78+/SwA==
=tWXS
-----END PGP SIGNATURE-----
Merge tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 entry code updates from Thomas Gleixner:
- Convert the 32bit syscalls to be pt_regs based which removes the
requirement to push all 6 potential arguments onto the stack and
consolidates the interface with the 64bit variant
- The first small portion of the exception and syscall related entry
code consolidation which aims to address the recently discovered
issues vs. RCU, int3, NMI and some other exceptions which can
interrupt any context. The bulk of the changes is still work in
progress and aimed for 5.8.
- A few lockdep namespace cleanups which have been applied into this
branch to keep the prerequisites for the ongoing work confined.
* tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS
lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}()
lockdep: Rename trace_softirqs_{on,off}()
lockdep: Rename trace_hardirq_{enter,exit}()
x86/entry: Rename ___preempt_schedule
x86: Remove unneeded includes
x86/entry: Drop asmlinkage from syscalls
x86/entry/32: Enable pt_regs based syscalls
x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments
x86/entry/32: Rename 32-bit specific syscalls
x86/entry/32: Clean up syscall_32.tbl
x86/entry: Remove ABI prefixes from functions in syscall tables
x86/entry/64: Add __SYSCALL_COMMON()
x86/entry: Remove syscall qualifier support
x86/entry/64: Remove ptregs qualifier from syscall table
x86/entry: Move max syscall number calculation to syscallhdr.sh
x86/entry/64: Split X32 syscall table into its own file
x86/entry/64: Move sys_ni_syscall stub to common.c
x86/entry/64: Use syscall wrappers for x32_rt_sigreturn
x86/entry: Refactor SYS_NI macros
...
With the command-line option -mx86-used-note=yes which can also be
enabled at binutils build time with:
--enable-x86-used-note generate GNU x86 used ISA and feature properties
the x86 assembler in binutils 2.32 and above generates a program property
note in a note section, .note.gnu.property, to encode used x86 ISAs and
features. But kernel linker script only contains a single NOTE segment:
PHDRS
{
text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
note PT_NOTE FLAGS(4); /* PF_R */
eh_frame_hdr 0x6474e550;
}
The NOTE segment generated by the vDSO linker script is aligned to 4 bytes.
But the .note.gnu.property section must be aligned to 8 bytes on x86-64:
[hjl@gnu-skx-1 vdso]$ readelf -n vdso64.so
Displaying notes found in: .note
Owner Data size Description
Linux 0x00000004 Unknown note type: (0x00000000)
description data: 06 00 00 00
readelf: Warning: note with invalid namesz and/or descsz found at offset 0x20
readelf: Warning: type: 0x78, namesize: 0x00000100, descsize: 0x756e694c, alignment: 8
Since the note.gnu.property section in the vDSO is not checked by the
dynamic linker, discard the .note.gnu.property sections in the vDSO.
[ bp: Massage. ]
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200326174314.254662-1-hjl.tools@gmail.com
Because moar '_' isn't always moar readable.
git grep -l "___preempt_schedule\(_notrace\)*" | while read file;
do
sed -ie 's/___preempt_schedule\(_notrace\)*/preempt_schedule\1_thunk/g' $file;
done
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200320115858.995685950@infradead.org
asmlinkage is no longer required since the syscall ABI is now fully under
x86 architecture control. This makes the 32-bit native syscalls a bit more
effecient by passing in regs via EAX instead of on the stack.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200313195144.164260-18-brgerst@gmail.com
Enable pt_regs based syscalls for 32-bit. This makes the 32-bit native
kernel consistent with the 64-bit kernel, and improves the syscall
interface by not needing to push all 6 potential arguments onto the stack.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-17-brgerst@gmail.com
For the 32-bit syscall interface, 64-bit arguments (loff_t) are passed via
a pair of 32-bit registers. These register pairs end up in consecutive stack
slots, which matches the C ABI for 64-bit arguments. But when accessing the
registers directly from pt_regs, the wrapper needs to manually reassemble the
64-bit value. These wrappers already exist for 32-bit compat, so make them
available to 32-bit native in preparation for enabling pt_regs-based syscalls.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-16-brgerst@gmail.com
Rename the syscalls that only exist for 32-bit from x86_* to ia32_* to make it
clear they are for 32-bit only. Also rename the functions to match the syscall
name.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-15-brgerst@gmail.com
After removal of the __ia32_ prefix, remove compat entries that are now
identical to the native entry.
Converted with this script and fixing up whitespace:
while read nr abi name entry compat; do
if [ "${nr:0:1}" = "#" ]; then
echo $nr $abi $name $entry $compat
continue
fi
if [ "$entry" = "$compat" ]; then
compat=""
fi
echo "$nr $abi $name $entry $compat"
done
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-14-brgerst@gmail.com
Move the ABI prefixes to the __SYSCALL_[abi]() macros. This allows removal
of the need to strip the prefix for UML.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-13-brgerst@gmail.com
Syscall qualifier support is no longer needed.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-11-brgerst@gmail.com
Now that the fast syscall path is removed, the ptregs qualifier is unused.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-10-brgerst@gmail.com
Instead of using an array in asm-offsets to calculate the max syscall
number, calculate it when writing out the syscall headers.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-9-brgerst@gmail.com
Since X32 has its own syscall table now, move it to a separate file.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Link: https://lkml.kernel.org/r/20200313195144.164260-8-brgerst@gmail.com
so it can be available to multiple syscall tables. Also directly return
-ENOSYS instead of bouncing to the generic sys_ni_syscall().
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200313195144.164260-7-brgerst@gmail.com
User space cannot disable interrupts any longer so trace return to user space
unconditionally as IRQS_ON.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200308222609.314596327@linutronix.de
Nothing cares about the -1 "mark as interrupt" in the errorcode of
exception entries. It's only used to fill the error code when a signal is
delivered, but this is already inconsistent vs. 64 bit as there all
exceptions which do not have an error code set it to 0. So if 32 bit
applications would care about this, then they would have noticed more than
a decade ago.
Just use 0 for all excpetions which do not have an errorcode consistently.
This does neither break /proc/$PID/syscall because this interface examines
the error code / syscall number which is on the stack and that is set to -1
(no syscall) in common_exception unconditionally for all exceptions. The
push in the entry stub is just there to fill the hardware error code slot
on the stack for consistency of the stack layout.
A transient observation of 0 is possible, but that's true for the other
exceptions which use 0 already as well and that interface is an unreliable
snapshot of dubious correctness anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/87mu94m7ky.fsf@nanos.tec.linutronix.de