linux_dsm_epyc7002/arch/x86/include/asm/fpu/api.h
Rik van Riel 5f409e20b7 x86/fpu: Defer FPU state load until return to userspace
Defer loading of FPU state until return to userspace. This gives
the kernel the potential to skip loading FPU state for tasks that
stay in kernel mode, or for tasks that end up with repeated
invocations of kernel_fpu_begin() & kernel_fpu_end().

The fpregs_lock/unlock() section ensures that the registers remain
unchanged. Otherwise a context switch or a bottom half could save the
registers to its FPU context and the processor's FPU registers would
became random if modified at the same time.

KVM swaps the host/guest registers on entry/exit path. This flow has
been kept as is. First it ensures that the registers are loaded and then
saves the current (host) state before it loads the guest's registers. The
swap is done at the very end with disabled interrupts so it should not
change anymore before theg guest is entered. The read/save version seems
to be cheaper compared to memcpy() in a micro benchmark.

Each thread gets TIF_NEED_FPU_LOAD set as part of fork() / fpu__copy().
For kernel threads, this flag gets never cleared which avoids saving /
restoring the FPU state for kernel threads and during in-kernel usage of
the FPU registers.

 [
   bp: Correct and update commit message and fix checkpatch warnings.
   s/register/registers/ where it is used in plural.
   minor comment corrections.
   remove unused trace_x86_fpu_activate_state() TP.
 ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Waiman Long <longman@redhat.com>
Cc: x86-ml <x86@kernel.org>
Cc: Yi Wang <wang.yi59@zte.com.cn>
Link: https://lkml.kernel.org/r/20190403164156.19645-24-bigeasy@linutronix.de
2019-04-12 19:34:47 +02:00

66 lines
1.7 KiB
C

/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 1994 Linus Torvalds
*
* Pentium III FXSR, SSE support
* General FPU state handling cleanups
* Gareth Hughes <gareth@valinux.com>, May 2000
* x86-64 work by Andi Kleen 2002
*/
#ifndef _ASM_X86_FPU_API_H
#define _ASM_X86_FPU_API_H
#include <linux/bottom_half.h>
/*
* Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
* disables preemption so be careful if you intend to use it for long periods
* of time.
* If you intend to use the FPU in softirq you need to check first with
* irq_fpu_usable() if it is possible.
*/
extern void kernel_fpu_begin(void);
extern void kernel_fpu_end(void);
extern bool irq_fpu_usable(void);
extern void fpregs_mark_activate(void);
/*
* Use fpregs_lock() while editing CPU's FPU registers or fpu->state.
* A context switch will (and softirq might) save CPU's FPU registers to
* fpu->state and set TIF_NEED_FPU_LOAD leaving CPU's FPU registers in
* a random state.
*/
static inline void fpregs_lock(void)
{
preempt_disable();
local_bh_disable();
}
static inline void fpregs_unlock(void)
{
local_bh_enable();
preempt_enable();
}
#ifdef CONFIG_X86_DEBUG_FPU
extern void fpregs_assert_state_consistent(void);
#else
static inline void fpregs_assert_state_consistent(void) { }
#endif
/*
* Load the task FPU state before returning to userspace.
*/
extern void switch_fpu_return(void);
/*
* Query the presence of one or more xfeatures. Works on any legacy CPU as well.
*
* If 'feature_name' is set then put a human-readable description of
* the feature there as well - this can be used to print error (or success)
* messages.
*/
extern int cpu_has_xfeatures(u64 xfeatures_mask, const char **feature_name);
#endif /* _ASM_X86_FPU_API_H */