linux_dsm_epyc7002/arch/x86
Yonghong Song e7ed9d9bd0 uprobes/x86: Emulate push insns for uprobe on x86
Uprobe is a tracing mechanism for userspace programs.
Typical uprobe will incur overhead of two traps.
First trap is caused by replaced trap insn, and
the second trap is to execute the original displaced
insn in user space.

To reduce the overhead, kernel provides hooks
for architectures to emulate the original insn
and skip the second trap. In x86, emulation
is done for certain branch insns.

This patch extends the emulation to "push <reg>"
insns. These insns are typical in the beginning
of the function. For example, bcc
in https://github.com/iovisor/bcc repo provides
tools to measure funclantency, detect memleak, etc.
The tools will place uprobes in the beginning of
function and possibly uretprobes at the end of function.
This patch is able to reduce the trap overhead for
uprobe from 2 to 1.

Without this patch, uretprobe will typically incur
three traps. With this patch, if the function starts
with "push" insn, the number of traps can be
reduced from 3 to 2.

An experiment was conducted on two local VMs,
fedora 26 64-bit VM and 32-bit VM, both 4 processors
and 4GB memory, booted with latest tip repo (and this patch).
The host is MacBook with intel i7 processor.

The test program looks like:

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <sys/time.h>

  static void test() __attribute__((noinline));
  void test() {}
  int main() {
    struct timeval start, end;

    gettimeofday(&start, NULL);
    for (int i = 0; i < 1000000; i++) {
      test();
    }
    gettimeofday(&end, NULL);

    printf("%ld\n", ((end.tv_sec * 1000000 + end.tv_usec)
                     - (start.tv_sec * 1000000 + start.tv_usec)));
    return 0;
  }

The program is compiled without optimization, and
the first insn for function "test" is "push %rbp".
The host is relatively idle.

Before the test run, the uprobe is inserted as below for uprobe:
  echo 'p <binary>:<test_func_offset>' > /sys/kernel/debug/tracing/uprobe_events
  echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable
and for uretprobe:
  echo 'r <binary>:<test_func_offset>' > /sys/kernel/debug/tracing/uprobe_events
  echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable

Unit: microsecond(usec) per loop iteration

x86_64          W/ this patch   W/O this patch
uprobe          1.55            3.1
uretprobe       2.0             3.6

x86_32          W/ this patch   W/O this patch
uprobe          1.41            3.5
uretprobe       1.75            4.0

You can see that this patch significantly reduced the overhead,
50% for uprobe and 44% for uretprobe on x86_64, and even more
on x86_32.

Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20171201001202.3706564-1-yhs@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11 18:42:11 +01:00
..
boot x86/boot/KASLR: Remove unused variable 2017-11-23 20:17:59 +01:00
configs x86/unwind: Rename unwinder config options to 'CONFIG_UNWINDER_*' 2017-10-14 10:12:12 +02:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2017-11-14 10:52:09 -08:00
entry Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-11-26 14:11:54 -08:00
events Merge branch 'linus' into perf/urgent, to pick up dependent commits 2017-11-29 07:11:24 +01:00
hyperv Char/Misc patches for 4.15-rc1 2017-11-16 09:10:59 -08:00
ia32 License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
include uprobes/x86: Emulate push insns for uprobe on x86 2017-12-11 18:42:11 +01:00
kernel uprobes/x86: Emulate push insns for uprobe on x86 2017-12-11 18:42:11 +01:00
kvm KVM: Let KVM_SET_SIGNAL_MASK work as advertised 2017-11-27 17:53:47 +01:00
lib x86/decoder: Add new TEST instruction pattern 2017-11-24 08:36:12 +01:00
math-emu License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mm x86: don't hash faulting address in oops printout 2017-12-05 17:59:29 -08:00
net bpf: fix bpf_tail_call() x64 JIT 2017-10-03 16:04:44 -07:00
oprofile Modules updates for v4.15 2017-11-15 13:46:33 -08:00
pci pci-v4.15-changes 2017-11-15 15:01:28 -08:00
platform kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2017-11-15 18:21:04 -08:00
power License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
purgatory License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
ras License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
realmode x86/realmode: Don't decrypt trampoline area under SEV 2017-11-07 15:35:55 +01:00
tools License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
um Merge branch 'linus' into x86/asm, to pick up fixes and resolve conflicts 2017-11-07 10:53:06 +01:00
video
xen xen: features and fixes for v4.15-rc1 2017-11-16 13:06:27 -08:00
.gitignore
Kbuild Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-07 09:25:15 -07:00
Kconfig Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-11-26 14:11:54 -08:00
Kconfig.cpu License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig.debug Merge branch 'linus' into x86/asm, to pick up fixes and resolve conflicts 2017-11-07 10:53:06 +01:00
Makefile kmemcheck: remove annotations 2017-11-15 18:21:04 -08:00
Makefile_32.cpu License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Makefile.um License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00