x86/fpu: Restore regs in copy_fpstate_to_sigframe() in order to use the fastpath

If a task is scheduled out and receives a signal then it won't be
able to take the fastpath because the registers aren't available. The
slowpath is more expensive compared to XRSTOR + XSAVE which usually
succeeds.

Here are some clock_gettime() numbers from a bigger box with AVX512
during bootup:

- __fpregs_load_activate() takes 140ns - 350ns. If it was the most recent
  FPU context on the CPU then the optimisation in __fpregs_load_activate()
  will skip the load (which was disabled during the test).

- copy_fpregs_to_sigframe() takes 200ns - 450ns if it succeeds. On a
  pagefault it is 1.8us - 3us usually in the 2.6us area.

- The slowpath takes 1.5us - 6us. Usually in the 2.6us area.

My testcases (including lat_sig) take the fastpath without
__fpregs_load_activate(). I expect this to be the majority.

Since the slowpath is in the >1us area it makes sense to load the
registers and attempt to save them directly. The direct save may fail
but should only happen on the first invocation or after fork() while the
page is read-only.

 [ bp: Massage a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-27-bigeasy@linutronix.de
This commit is contained in:
Sebastian Andrzej Siewior 2019-04-12 20:16:15 +02:00 committed by Borislav Petkov
parent da2f32fb8d
commit 06b251dff7

View File

@ -175,20 +175,21 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
(struct _fpstate_32 __user *) buf) ? -1 : 1;
/*
* If we do not need to load the FPU registers at return to userspace
* then the CPU has the current state. Try to save it directly to
* userland's stack frame if it does not cause a pagefault. If it does,
* try the slowpath.
* Load the FPU registers if they are not valid for the current task.
* With a valid FPU state we can attempt to save the state directly to
* userland's stack frame which will likely succeed. If it does not, do
* the slowpath.
*/
fpregs_lock();
if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
pagefault_disable();
ret = copy_fpregs_to_sigframe(buf_fx);
pagefault_enable();
if (ret)
copy_fpregs_to_fpstate(fpu);
set_thread_flag(TIF_NEED_FPU_LOAD);
}
if (test_thread_flag(TIF_NEED_FPU_LOAD))
__fpregs_load_activate();
pagefault_disable();
ret = copy_fpregs_to_sigframe(buf_fx);
pagefault_enable();
if (ret && !test_thread_flag(TIF_NEED_FPU_LOAD))
copy_fpregs_to_fpstate(fpu);
set_thread_flag(TIF_NEED_FPU_LOAD);
fpregs_unlock();
if (ret) {