Commit Graph

355 Commits

Author SHA1 Message Date
Yonghong Song
69a0f9ecef bpf, bpftool: fix a few ubsan warnings
The issue is reported at https://github.com/libbpf/libbpf/issues/28.

Basically, per C standard, for
  void *memcpy(void *dest, const void *src, size_t n)
if "dest" or "src" is NULL, regardless of whether "n" is 0 or not,
the result of memcpy is undefined. clang ubsan reported three such
instances in bpf.c with the following pattern:
  memcpy(dest, 0, 0).

Although in practice, no known compiler will cause issues when
copy size is 0. Let us still fix the issue to silence ubsan
warnings.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-10 09:46:51 +02:00
Daniel Borkmann
1713d68b3b bpf, libbpf: add support for BTF Var and DataSec
This adds libbpf support for BTF Var and DataSec kinds. Main point
here is that libbpf needs to do some preparatory work before the
whole BTF object can be loaded into the kernel, that is, fixing up
of DataSec size taken from the ELF section size and non-static
variable offset which needs to be taken from the ELF's string section.

Upstream LLVM doesn't fix these up since at time of BTF emission
it is too early in the compilation process thus this information
isn't available yet, hence loader needs to take care of it.

Note, deduplication handling has not been in the scope of this work
and needs to be addressed in a future commit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://reviews.llvm.org/D59441
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:47 -07:00
Daniel Borkmann
d859900c4c bpf, libbpf: support global data/bss/rodata sections
This work adds BPF loader support for global data sections
to libbpf. This allows to write BPF programs in more natural
C-like way by being able to define global variables and const
data.

Back at LPC 2018 [0] we presented a first prototype which
implemented support for global data sections by extending BPF
syscall where union bpf_attr would get additional memory/size
pair for each section passed during prog load in order to later
add this base address into the ldimm64 instruction along with
the user provided offset when accessing a variable. Consensus
from LPC was that for proper upstream support, it would be
more desirable to use maps instead of bpf_attr extension as
this would allow for introspection of these sections as well
as potential live updates of their content. This work follows
this path by taking the following steps from loader side:

 1) In bpf_object__elf_collect() step we pick up ".data",
    ".rodata", and ".bss" section information.

 2) If present, in bpf_object__init_internal_map() we add
    maps to the obj's map array that corresponds to each
    of the present sections. Given section size and access
    properties can differ, a single entry array map is
    created with value size that is corresponding to the
    ELF section size of .data, .bss or .rodata. These
    internal maps are integrated into the normal map
    handling of libbpf such that when user traverses all
    obj maps, they can be differentiated from user-created
    ones via bpf_map__is_internal(). In later steps when
    we actually create these maps in the kernel via
    bpf_object__create_maps(), then for .data and .rodata
    sections their content is copied into the map through
    bpf_map_update_elem(). For .bss this is not necessary
    since array map is already zero-initialized by default.
    Additionally, for .rodata the map is frozen as read-only
    after setup, such that neither from program nor syscall
    side writes would be possible.

 3) In bpf_program__collect_reloc() step, we record the
    corresponding map, insn index, and relocation type for
    the global data.

 4) And last but not least in the actual relocation step in
    bpf_program__relocate(), we mark the ldimm64 instruction
    with src_reg = BPF_PSEUDO_MAP_VALUE where in the first
    imm field the map's file descriptor is stored as similarly
    done as in BPF_PSEUDO_MAP_FD, and in the second imm field
    (as ldimm64 is 2-insn wide) we store the access offset
    into the section. Given these maps have only single element
    ldimm64's off remains zero in both parts.

 5) On kernel side, this special marked BPF_PSEUDO_MAP_VALUE
    load will then store the actual target address in order
    to have a 'map-lookup'-free access. That is, the actual
    map value base address + offset. The destination register
    in the verifier will then be marked as PTR_TO_MAP_VALUE,
    containing the fixed offset as reg->off and backing BPF
    map as reg->map_ptr. Meaning, it's treated as any other
    normal map value from verification side, only with
    efficient, direct value access instead of actual call to
    map lookup helper as in the typical case.

Currently, only support for static global variables has been
added, and libbpf rejects non-static global variables from
loading. This can be lifted until we have proper semantics
for how BPF will treat multi-object BPF loads. From BTF side,
libbpf will set the value type id of the types corresponding
to the ".bss", ".data" and ".rodata" names which LLVM will
emit without the object name prefix. The key type will be
left as zero, thus making use of the key-less BTF option in
array maps.

Simple example dump of program using globals vars in each
section:

  # bpftool prog
  [...]
  6784: sched_cls  name load_static_dat  tag a7e1291567277844  gpl
        loaded_at 2019-03-11T15:39:34+0000  uid 0
        xlated 1776B  jited 993B  memlock 4096B  map_ids 2238,2237,2235,2236,2239,2240

  # bpftool map show id 2237
  2237: array  name test_glo.bss  flags 0x0
        key 4B  value 64B  max_entries 1  memlock 4096B
  # bpftool map show id 2235
  2235: array  name test_glo.data  flags 0x0
        key 4B  value 64B  max_entries 1  memlock 4096B
  # bpftool map show id 2236
  2236: array  name test_glo.rodata  flags 0x80
        key 4B  value 96B  max_entries 1  memlock 4096B

  # bpftool prog dump xlated id 6784
  int load_static_data(struct __sk_buff * skb):
  ; int load_static_data(struct __sk_buff *skb)
     0: (b7) r6 = 0
  ; test_reloc(number, 0, &num0);
     1: (63) *(u32 *)(r10 -4) = r6
     2: (bf) r2 = r10
  ; int load_static_data(struct __sk_buff *skb)
     3: (07) r2 += -4
  ; test_reloc(number, 0, &num0);
     4: (18) r1 = map[id:2238]
     6: (18) r3 = map[id:2237][0]+0    <-- direct addr in .bss area
     8: (b7) r4 = 0
     9: (85) call array_map_update_elem#100464
    10: (b7) r1 = 1
  ; test_reloc(number, 1, &num1);
  [...]
  ; test_reloc(string, 2, str2);
   120: (18) r8 = map[id:2237][0]+16   <-- same here at offset +16
   122: (18) r1 = map[id:2239]
   124: (18) r3 = map[id:2237][0]+16
   126: (b7) r4 = 0
   127: (85) call array_map_update_elem#100464
   128: (b7) r1 = 120
  ; str1[5] = 'x';
   129: (73) *(u8 *)(r9 +5) = r1
  ; test_reloc(string, 3, str1);
   130: (b7) r1 = 3
   131: (63) *(u32 *)(r10 -4) = r1
   132: (b7) r9 = 3
   133: (bf) r2 = r10
  ; int load_static_data(struct __sk_buff *skb)
   134: (07) r2 += -4
  ; test_reloc(string, 3, str1);
   135: (18) r1 = map[id:2239]
   137: (18) r3 = map[id:2235][0]+16   <-- direct addr in .data area
   139: (b7) r4 = 0
   140: (85) call array_map_update_elem#100464
   141: (b7) r1 = 111
  ; __builtin_memcpy(&str2[2], "hello", sizeof("hello"));
   142: (73) *(u8 *)(r8 +6) = r1       <-- further access based on .bss data
   143: (b7) r1 = 108
   144: (73) *(u8 *)(r8 +5) = r1
  [...]

For Cilium use-case in particular, this enables migrating configuration
constants from Cilium daemon's generated header defines into global
data sections such that expensive runtime recompilations with LLVM can
be avoided altogether. Instead, the ELF file becomes effectively a
"template", meaning, it is compiled only once (!) and the Cilium daemon
will then rewrite relevant configuration data from the ELF's .data or
.rodata sections directly instead of recompiling the program. The
updated ELF is then loaded into the kernel and atomically replaces
the existing program in the networking datapath. More info in [0].

Based upon recent fix in LLVM, commit c0db6b6bd444 ("[BPF] Don't fail
for static variables").

  [0] LPC 2018, BPF track, "ELF relocation for static data in BPF",
      http://vger.kernel.org/lpc-bpf2018.html#session-3

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:47 -07:00
Joe Stringer
f8c7a4d4dc bpf, libbpf: refactor relocation handling
Adjust the code for relocations slightly with no functional changes,
so that upcoming patches that will introduce support for relocations
into the .data, .rodata and .bss sections can be added independent
of these changes.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-09 17:05:47 -07:00
Andrey Ignatov
ff466b5805 libbpf: Ignore -Wformat-nonliteral warning
vsprintf() in __base_pr() uses nonliteral format string and it breaks
compilation for those who provide corresponding extra CFLAGS, e.g.:
https://github.com/libbpf/libbpf/issues/27

If libbpf is built with the flags from PR:

  libbpf.c:68:26: error: format string is not a string literal
  [-Werror,-Wformat-nonliteral]
          return vfprintf(stderr, format, args);
                                  ^~~~~~
  1 error generated.

Ignore this warning since the use case in libbpf.c is legit.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-06 23:13:54 -07:00
Alexei Starovoitov
da11b41758 libbpf: teach libbpf about log_level bit 2
Allow bpf_prog_load_xattr() to specify log_level for program loading.

Teach libbpf to accept log_level with bit 2 set.

Increase default BPF_LOG_BUF_SIZE from 256k to 16M.
There is no downside to increase it to a maximum allowed by old kernels.
Existing 256k limit caused ENOSPC errors and users were not able to see
verifier error which is printed at the end of the verifier log.

If ENOSPC is hit, double the verifier log and try again to capture
the verifier error.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-04 01:27:38 +02:00
David S. Miller
22bdf7d459 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-03-29

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Bug fix in BTF deduplication that was mishandling an equivalence
   comparison, from Andrii.

2) libbpf Makefile fixes to properly link against libelf for the shared
   object and to actually export AF_XDP's xsk.h header, from Björn.

3) Fix use after free in bpf inode eviction, from Daniel.

4) Fix a bug in skb creation out of cpumap redirect, from Jesper.

5) Remove an unnecessary and triggerable WARN_ONCE() in max number
   of call stack frames checking in verifier, from Paul.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-29 21:00:28 -07:00
Luca Boccassi
dd399ac9e3 tools/bpf: generate pkg-config file for libbpf
Generate a libbpf.pc file at build time so that users can rely
on pkg-config to find the library, its CFLAGS and LDFLAGS.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-28 17:06:03 +01:00
Daniel Borkmann
8543e43780 bpf, libbpf: fix quiet install_headers
Both btf.h and xsk.h headers are not installed quietly due to
missing '\' for the call to QUIET_INSTALL. Lets fix it.

Before:

  # make install_headers
    INSTALL  headers
  if [ ! -d '''/usr/local/include/bpf' ]; then install -d -m 755 '''/usr/local/include/bpf'; fi; install btf.h -m 644 '''/usr/local/include/bpf';
  if [ ! -d '''/usr/local/include/bpf' ]; then install -d -m 755 '''/usr/local/include/bpf'; fi; install xsk.h -m 644 '''/usr/local/include/bpf';
  # ls /usr/local/include/bpf/
  bpf.h  btf.h  libbpf.h  xsk.h

After:

  # make install_headers
    INSTALL  headers
  # ls /usr/local/include/bpf/
  bpf.h  btf.h  libbpf.h  xsk.h

Fixes: a493f5f9d8 ("libbpf: Install btf.h with libbpf")
Fixes: 379e2014c9 ("libbpf: add xsk.h to install_headers target")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
2019-03-28 17:01:37 +01:00
Björn Töpel
89dedaef49 libbpf: add libelf dependency to shared library build
The DPDK project is moving forward with its AF_XDP PMD, and during
that process some libbpf issues surfaced [1]: When libbpf was built
as a shared library, libelf was not included in the linking phase.
Since libelf is an internal depedency to libbpf, libelf should be
included. This patch adds '-lelf' to resolve that.

  [1] https://patches.dpdk.org/patch/50704/#93571

Fixes: 1b76c13e4b ("bpf tools: Introduce 'bpf' library and add bpf feature check")
Suggested-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-28 16:24:52 +01:00
Björn Töpel
379e2014c9 libbpf: add xsk.h to install_headers target
The xsk.h header file was missing from the install_headers target in
the Makefile. This patch simply adds xsk.h to the set of installed
headers.

Fixes: 1cad078842 ("libbpf: add support for using AF_XDP sockets")
Reported-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-28 16:24:21 +01:00
Linus Torvalds
1a9df9e29c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Fixes here and there, a couple new device IDs, as usual:

   1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei.

   2) Fix 64-bit division in iwlwifi, from Arnd Bergmann.

   3) Fix documentation for some eBPF helpers, from Quentin Monnet.

   4) Some UAPI bpf header sync with tools, also from Quentin Monnet.

   5) Set descriptor ownership bit at the right time for jumbo frames in
      stmmac driver, from Aaro Koskinen.

   6) Set IFF_UP properly in tun driver, from Eric Dumazet.

   7) Fix load/store doubleword instruction generation in powerpc eBPF
      JIT, from Naveen N. Rao.

   8) nla_nest_start() return value checks all over, from Kangjie Lu.

   9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this
      merge window. From Marcelo Ricardo Leitner and Xin Long.

  10) Fix memory corruption with large MTUs in stmmac, from Aaro
      Koskinen.

  11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric
      Dumazet.

  12) Fix topology subscription cancellation in tipc, from Erik Hugne.

  13) Memory leak in genetlink error path, from Yue Haibing.

  14) Valid control actions properly in packet scheduler, from Davide
      Caratti.

  15) Even if we get EEXIST, we still need to rehash if a shrink was
      delayed. From Herbert Xu.

  16) Fix interrupt mask handling in interrupt handler of r8169, from
      Heiner Kallweit.

  17) Fix leak in ehea driver, from Wen Yang"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits)
  dpaa2-eth: fix race condition with bql frame accounting
  chelsio: use BUG() instead of BUG_ON(1)
  net: devlink: skip info_get op call if it is not defined in dumpit
  net: phy: bcm54xx: Encode link speed and activity into LEDs
  tipc: change to check tipc_own_id to return in tipc_net_stop
  net: usb: aqc111: Extend HWID table by QNAP device
  net: sched: Kconfig: update reference link for PIE
  net: dsa: qca8k: extend slave-bus implementations
  net: dsa: qca8k: remove leftover phy accessors
  dt-bindings: net: dsa: qca8k: support internal mdio-bus
  dt-bindings: net: dsa: qca8k: fix example
  net: phy: don't clear BMCR in genphy_soft_reset
  bpf, libbpf: clarify bump in libbpf version info
  bpf, libbpf: fix version info and add it to shared object
  rxrpc: avoid clang -Wuninitialized warning
  tipc: tipc clang warning
  net: sched: fix cleanup NULL pointer exception in act_mirr
  r8169: fix cable re-plugging issue
  net: ethernet: ti: fix possible object reference leak
  net: ibm: fix possible object reference leak
  ...
2019-03-27 12:22:57 -07:00
Andrii Nakryiko
9ec71c1cdb libbpf: fix btf_dedup equivalence check handling of different kinds
btf_dedup_is_equiv() used to compare btf_type->info fields, before doing
kind-specific equivalence check. This comparsion implicitly verified
that candidate and canonical types are of the same kind. With enum fwd
resolution logic this check couldn't be done generically anymore, as for
enums info contains vlen, which differs between enum fwd and
fully-defined enum, so this check was subsumed by kind-specific
equivalence checks.

This change caused btf_dedup_is_equiv() to let through VOID vs other
types check to reach switch, which was never meant to be handing VOID
kind, as VOID kind is always pre-resolved to itself and is only
equivalent to itself, which is checked early in btf_dedup_is_equiv().

This change adds back BTF kind equality check in place of more generic
btf_type->info check, still defering further kind-specific checks to
a per-kind switch.

Fixes: 9768095ba9 ("btf: resolve enum fwds in btf_dedup")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-27 08:01:24 -07:00
Daniel Borkmann
63197f78bc bpf, libbpf: clarify bump in libbpf version info
The current documentation suggests that we would need to bump the
libbpf version on every change. Lets clarify this a bit more and
reflect what we do today in practice, that is, bumping it once per
development cycle.

Fixes: 76d1b894c5 ("libbpf: Document API and ABI conventions")
Reported-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-24 19:49:04 -07:00
Daniel Borkmann
1d382264d9 bpf, libbpf: fix version info and add it to shared object
Even though libbpf's versioning script for the linker (libbpf.map)
is pointing to 0.0.2, the BPF_EXTRAVERSION in the Makefile has
not been updated along with it and is therefore still on 0.0.1.

While fixing up, I also noticed that the generated shared object
versioning information is missing, typical convention is to have
a linker name (libbpf.so), soname (libbpf.so.0) and real name
(libbpf.so.0.0.2) for library management. This is based upon the
LIBBPF_VERSION as well.

The build will then produce the following bpf libraries:

  # ll libbpf*
  libbpf.a
  libbpf.so -> libbpf.so.0.0.2
  libbpf.so.0 -> libbpf.so.0.0.2
  libbpf.so.0.0.2

  # readelf -d libbpf.so.0.0.2 | grep SONAME
  0x000000000000000e (SONAME)             Library soname: [libbpf.so.0]

And install them accordingly:

  # rm -rf /tmp/bld; mkdir /tmp/bld; make -j$(nproc) O=/tmp/bld install

  Auto-detecting system features:
  ...                        libelf: [ on  ]
  ...                           bpf: [ on  ]

    CC       /tmp/bld/libbpf.o
    CC       /tmp/bld/bpf.o
    CC       /tmp/bld/nlattr.o
    CC       /tmp/bld/btf.o
    CC       /tmp/bld/libbpf_errno.o
    CC       /tmp/bld/str_error.o
    CC       /tmp/bld/netlink.o
    CC       /tmp/bld/bpf_prog_linfo.o
    CC       /tmp/bld/libbpf_probes.o
    CC       /tmp/bld/xsk.o
    LD       /tmp/bld/libbpf-in.o
    LINK     /tmp/bld/libbpf.a
    LINK     /tmp/bld/libbpf.so.0.0.2
    LINK     /tmp/bld/test_libbpf
    INSTALL  /tmp/bld/libbpf.a
    INSTALL  /tmp/bld/libbpf.so.0.0.2

  # ll /usr/local/lib64/libbpf.*
  /usr/local/lib64/libbpf.a
  /usr/local/lib64/libbpf.so -> libbpf.so.0.0.2
  /usr/local/lib64/libbpf.so.0 -> libbpf.so.0.0.2
  /usr/local/lib64/libbpf.so.0.0.2

Fixes: 1bf4b05810 ("tools: bpftool: add probes for eBPF program types")
Fixes: 1b76c13e4b ("bpf tools: Introduce 'bpf' library and add bpf feature check")
Reported-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-24 19:49:04 -07:00
Thomas Gleixner
d8b5297f6d perf/core improvements and fixes:
BPF:
 
   Song Liu:
 
   - Add support for annotating BPF programs, using the PERF_RECORD_BPF_EVENT
     and PERF_RECORD_KSYMBOL recently added to the kernel and plugging
     binutils's libopcodes disassembly of BPF programs with the existing
     annotation interfaces in 'perf annotate', 'perf report' and 'perf top'
     various output formats (--stdio, --stdio2, --tui).
 
 perf list:
 
   Andi Kleen:
 
   - Filter metrics when using substring search.
 
 perf record:
 
   Andi Kleen:
 
   - Allow to limit number of reported perf.data files
 
   - Clarify help for --switch-output.
 
 perf report:
 
   Andi Kleen
 
   - Indicate JITed code better.
 
   - Show all sort keys in help output.
 
 perf script:
 
   Andi Kleen:
 
   - Support relative time.
 
 perf stat:
 
   Andi Kleen:
 
   - Improve scaling.
 
 General:
 
   Changbin Du:
 
   - Fix some mostly error path memory and reference count leaks found
     using gcc's ASan and UBSan.
 
 Vendor events:
 
   Mamatha Inamdar:
 
   - Remove P8 HW events which are not supported.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXJOmigAKCRCyPKLppCJ+
 J+EPAQDNzH1M3uJ6cOhyzAMowpsl0Dgs0Q+5iNlOnDYVr2RfhgEA2Sr2fQyl/qiG
 h6jRbzvdE+PTXbcMNO79ajmufAHdLgQ=
 =DuTU
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.1-20190321' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/core improvements and fixes from Arnaldo:

BPF:

  Song Liu:

  - Add support for annotating BPF programs, using the PERF_RECORD_BPF_EVENT
    and PERF_RECORD_KSYMBOL recently added to the kernel and plugging
    binutils's libopcodes disassembly of BPF programs with the existing
    annotation interfaces in 'perf annotate', 'perf report' and 'perf top'
    various output formats (--stdio, --stdio2, --tui).

perf list:

  Andi Kleen:

  - Filter metrics when using substring search.

perf record:

  Andi Kleen:

  - Allow to limit number of reported perf.data files

  - Clarify help for --switch-output.

perf report:

  Andi Kleen

  - Indicate JITed code better.

  - Show all sort keys in help output.

perf script:

  Andi Kleen:

  - Support relative time.

perf stat:

  Andi Kleen:

  - Improve scaling.

General:

  Changbin Du:

  - Fix some mostly error path memory and reference count leaks found
    using gcc's ASan and UBSan.

Vendor events:

  Mamatha Inamdar:

  - Remove P8 HW events which are not supported.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-03-22 22:51:21 +01:00
Thomas Gleixner
4a98be8293 perf/core improvements and fixes:
kernel:
 
   Stephane Eranian :
 
   - Restore mmap record type correctly when handling PERF_RECORD_MMAP2
     events, as the same template is used for all the threads interested
     in mmap events, some may want just PERF_RECORD_MMAP, while some
     may want the extra info in MMAP2 records.
 
 perf probe:
 
   Adrian Hunter:
 
   - Fix getting the kernel map, because since changes related to x86 PTI
     entry trampolines handling, there are more than one kernel map.
 
 perf script:
 
   Andi Kleen:
 
   - Support insn output for normal samples, i.e.:
 
     perf script -F ip,sym,insn --xed
 
     Will fetch the sample IP from the thread address space and feed it
     to Intel's XED disassembler, producing lines such as:
 
       ffffffffa4068804 native_write_msr            wrmsr
       ffffffffa415b95e __hrtimer_next_event_base   movq  0x18(%rax), %rdx
 
     That match 'perf annotate's output.
 
   - Make the --cpu filter apply to  PERF_RECORD_COMM/FORK/... events, in
     addition to PERF_RECORD_SAMPLE.
 
 perf report:
 
   - Add a new --samples option to save a small random number of samples
     per hist entry, using a reservoir technique to select a representative
     number of samples.
 
     Then allow browsing the samples using 'perf script' as part of the hist
     entry context menu. This automatically adds the right filters, so only
     the thread or CPU of the sample is displayed. Then we use less' search
     functionality to directly jump to the time stamp of the selected sample.
 
     It uses different menus for assembler and source display.  Assembler
     needs xed installed and source needs debuginfo.
 
   - Fix the UI browser scripts pop up menu when there are many scripts
     available.
 
 perf report:
 
   Andi Kleen:
 
   - Add 'time' sort option. E.g.:
 
     % perf report --sort time,overhead,symbol --time-quantum 1ms --stdio
     ...
          0.67%  277061.87300  [.] _dl_start
          0.50%  277061.87300  [.] f1
          0.50%  277061.87300  [.] f2
          0.33%  277061.87300  [.] main
          0.29%  277061.87300  [.] _dl_lookup_symbol_x
          0.29%  277061.87300  [.] dl_main
          0.29%  277061.87300  [.] do_lookup_x
          0.17%  277061.87300  [.] _dl_debug_initialize
          0.17%  277061.87300  [.] _dl_init_paths
          0.08%  277061.87300  [.] check_match
          0.04%  277061.87300  [.] _dl_count_modids
          1.33%  277061.87400  [.] f1
          1.33%  277061.87400  [.] f2
          1.33%  277061.87400  [.] main
          1.17%  277061.87500  [.] main
          1.08%  277061.87500  [.] f1
          1.08%  277061.87500  [.] f2
          1.00%  277061.87600  [.] main
          0.83%  277061.87600  [.] f1
          0.83%  277061.87600  [.] f2
          1.00%  277061.87700  [.] main
 
 tools headers:
 
   Arnaldo Carvalho de Melo:
 
   - Update x86's syscall_64.tbl, no change in tools/perf behaviour.
 
   -  Sync copies asm-generic/unistd.h and linux/in with the kernel sources.
 
 perf data:
 
   Jiri Olsa:
 
   - Prep work to support having perf.data stored as a directory, with one
     file per CPU, that ultimately will allow having one ring buffer reading
     thread per CPU.
 
 Vendor events:
 
   Martin Liška:
 
   - perf PMU events for AMD Family 17h.
 
 perf script python:
 
   Tony Jones:
 
   - Add python3 support for the remaining Intel PT related scripts, with
     these we should have a clean build of perf with python3 while still
     supporting the build with python2.
 
 libbpf:
 
   Arnaldo Carvalho de Melo:
 
   - Fix the build on uCLibc, adding the missing stdarg.h since we use
     va_list in one typedef.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXIbMlgAKCRCyPKLppCJ+
 J/fzAQDNlP1cEuryAfWCDZ/sf5N/76srvkt/kIyYO0CliCjiBAEAiHRWrhsNs1Gd
 Z8626lCTYt7BTdz5yfTb7gbt/n7xNAY=
 =Ycye
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.1-20190311' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/core improvements and fixes from Arnaldo:

kernel:

  Stephane Eranian :

  - Restore mmap record type correctly when handling PERF_RECORD_MMAP2
    events, as the same template is used for all the threads interested
    in mmap events, some may want just PERF_RECORD_MMAP, while some
    may want the extra info in MMAP2 records.

perf probe:

  Adrian Hunter:

  - Fix getting the kernel map, because since changes related to x86 PTI
    entry trampolines handling, there are more than one kernel map.

perf script:

  Andi Kleen:

  - Support insn output for normal samples, i.e.:

    perf script -F ip,sym,insn --xed

    Will fetch the sample IP from the thread address space and feed it
    to Intel's XED disassembler, producing lines such as:

      ffffffffa4068804 native_write_msr            wrmsr
      ffffffffa415b95e __hrtimer_next_event_base   movq  0x18(%rax), %rdx

    That match 'perf annotate's output.

  - Make the --cpu filter apply to  PERF_RECORD_COMM/FORK/... events, in
    addition to PERF_RECORD_SAMPLE.

perf report:

  - Add a new --samples option to save a small random number of samples
    per hist entry, using a reservoir technique to select a representative
    number of samples.

    Then allow browsing the samples using 'perf script' as part of the hist
    entry context menu. This automatically adds the right filters, so only
    the thread or CPU of the sample is displayed. Then we use less' search
    functionality to directly jump to the time stamp of the selected sample.

    It uses different menus for assembler and source display.  Assembler
    needs xed installed and source needs debuginfo.

  - Fix the UI browser scripts pop up menu when there are many scripts
    available.

perf report:

  Andi Kleen:

  - Add 'time' sort option. E.g.:

    % perf report --sort time,overhead,symbol --time-quantum 1ms --stdio
    ...
         0.67%  277061.87300  [.] _dl_start
         0.50%  277061.87300  [.] f1
         0.50%  277061.87300  [.] f2
         0.33%  277061.87300  [.] main
         0.29%  277061.87300  [.] _dl_lookup_symbol_x
         0.29%  277061.87300  [.] dl_main
         0.29%  277061.87300  [.] do_lookup_x
         0.17%  277061.87300  [.] _dl_debug_initialize
         0.17%  277061.87300  [.] _dl_init_paths
         0.08%  277061.87300  [.] check_match
         0.04%  277061.87300  [.] _dl_count_modids
         1.33%  277061.87400  [.] f1
         1.33%  277061.87400  [.] f2
         1.33%  277061.87400  [.] main
         1.17%  277061.87500  [.] main
         1.08%  277061.87500  [.] f1
         1.08%  277061.87500  [.] f2
         1.00%  277061.87600  [.] main
         0.83%  277061.87600  [.] f1
         0.83%  277061.87600  [.] f2
         1.00%  277061.87700  [.] main

tools headers:

  Arnaldo Carvalho de Melo:

  - Update x86's syscall_64.tbl, no change in tools/perf behaviour.

  -  Sync copies asm-generic/unistd.h and linux/in with the kernel sources.

perf data:

  Jiri Olsa:

  - Prep work to support having perf.data stored as a directory, with one
    file per CPU, that ultimately will allow having one ring buffer reading
    thread per CPU.

Vendor events:

  Martin Liška:

  - perf PMU events for AMD Family 17h.

perf script python:

  Tony Jones:

  - Add python3 support for the remaining Intel PT related scripts, with
    these we should have a clean build of perf with python3 while still
    supporting the build with python2.

libbpf:

  Arnaldo Carvalho de Melo:

  - Fix the build on uCLibc, adding the missing stdarg.h since we use
    va_list in one typedef.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-22 22:50:41 +01:00
Song Liu
34be16466d tools lib bpf: Introduce bpf_program__get_prog_info_linear()
Currently, bpf_prog_info includes 9 arrays. The user has the option to
fetch any combination of these arrays. However, this requires a lot of
handling.

This work becomes more tricky when we need to store bpf_prog_info to a
file, because these arrays are allocated independently.

This patch introduces 'struct bpf_prog_info_linear', which stores arrays
of bpf_prog_info in continuous memory.

Helper functions are introduced to unify the work to get different sets
of bpf_prog_info.  Specifically, bpf_program__get_prog_info_linear()
allows the user to select which arrays to fetch, and handles details for
the user.

Please see the comments right before 'enum bpf_prog_info_array' for more
details and examples.

Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lkml.kernel.org/r/ce92c091-e80d-a0c1-4aa0-987706c42b20@iogearbox.net
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: kernel-team@fb.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: http://lkml.kernel.org/r/20190312053051.2690567-3-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19 16:52:06 -03:00
Changbin Du
11c1ea6f1a perf tools: Fix errors under optimization level '-Og'
Optimization level '-Og' offers a reasonable level of optimization while
maintaining fast compilation and a good debugging experience. This patch
tries to make it work.

  $ make DEBUG=1 EXTRA_CFLAGS='-Og'
  bench/epoll-ctl.c: In function ‘do_threads’:
  bench/epoll-ctl.c:274:9: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    return ret;
           ^~~
  ...

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20190316080556.3075-4-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-19 16:52:04 -03:00
Andrii Nakryiko
9768095ba9 btf: resolve enum fwds in btf_dedup
GCC and clang support enum forward declarations as an extension. Such
forward-declared enums will be represented as normal BTF_KIND_ENUM types with
vlen=0. This patch adds ability to resolve such enums to their corresponding
fully defined enums. This helps to avoid duplicated BTF type graphs which only
differ by some types referencing forward-declared enum vs full enum.

One such example in kernel is enum irqchip_irq_state, defined in
include/linux/interrupt.h and forward-declared in include/linux/irq.h. This
causes entire struct task_struct and all referenced types to be duplicated in
btf_dedup output. This patch eliminates such duplication cases.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-14 13:53:18 -07:00
Magnus Karlsson
6bf21b54a5 libbpf: fix to reject unknown flags in xsk_socket__create()
In xsk_socket__create(), the libbpf_flags field was not checked for
setting currently unused/unknown flags. This patch fixes that by
returning -EINVAL if the user has set any flag that is not in use at
this point in time.

Fixes: 1cad078842 ("libbpf: add support for using AF_XDP sockets")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-12 21:58:18 +01:00
Arnaldo Carvalho de Melo
dfcbc2f299 tools lib bpf: Fix the build by adding a missing stdarg.h include
The libbpf_print_fn_t typedef uses va_list without including the header
where that type is defined, stdarg.h, breaking in places where we're
unlucky for that type not to be already defined by some previously
included header.

Noticed while building on fedora 24 cross building tools/perf to the ARC
architecture using the uClibc C library:

  28 fedora:24-x-ARC-uClibc   : FAIL arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710

    CC       /tmp/build/perf/tests/llvm.o
  In file included from tests/llvm.c:3:0:
  /git/linux/tools/lib/bpf/libbpf.h:57:20: error: unknown type name 'va_list'
        const char *, va_list ap);
                      ^~~~~~~
  /git/linux/tools/lib/bpf/libbpf.h:59:34: error: unknown type name 'libbpf_print_fn_t'
   LIBBPF_API void libbpf_set_print(libbpf_print_fn_t fn);
                                    ^~~~~~~~~~~~~~~~~
  mv: cannot stat '/tmp/build/perf/tests/.llvm.o.tmp': No such file or directory

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Yonghong Song <yhs@fb.com>
Fixes: a8a1f7d09c ("libbpf: fix libbpf_print")
Link: https://lkml.kernel.org/n/tip-5270n2quu2gqz22o7itfdx00@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11 17:14:31 -03:00
Andrii Nakryiko
f38a1f0a5a libbpf: handle BTF parsing and loading properly
This patch splits and cleans up error handling logic for loading BTF data.
Previously, if BTF data was parsed successfully, but failed to load into
kernel, we'd report nonsensical error code, instead of error returned from
btf__load(). Now btf__new() and btf__load() are handled separately with proper
cleanup and warning reporting.

Fixes: d29d87f7e6 ("btf: separate btf creation and loading")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-11 10:14:12 +01:00
Nikita V. Shirokov
243b4cdab9 bpf, libbpf: fixing leak when kernel does not support btf
We could end up in situation when we have object file w/ all btf
info, but kernel does not support btf yet. In this situation
currently libbpf just set obj->btf to NULL w/o freeing it first.
This patch is fixing it by making sure to run btf__free first.

Fixes: d29d87f7e6 ("btf: separate btf creation and loading")
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-08 21:16:36 +01:00
Stanislav Fomichev
8e2688876c libbpf: force fixdep compilation at the start of the build
libbpf targets don't explicitly depend on fixdep target, so when
we do 'make -j$(nproc)', there is a high probability, that some
objects will be built before fixdep binary is available.

Fix this by running sub-make; this makes sure that fixdep dependency
is properly accounted for.

For the same issue in perf, see commit abb26210a3 ("perf tools: Force
fixdep compilation at the start of the build").

Before:

$ rm -rf /tmp/bld; mkdir /tmp/bld; make -j$(nproc) O=/tmp/bld -C tools/lib/bpf/

Auto-detecting system features:
...                        libelf: [ on  ]
...                           bpf: [ on  ]

  HOSTCC   /tmp/bld/fixdep.o
  CC       /tmp/bld/libbpf.o
  CC       /tmp/bld/bpf.o
  CC       /tmp/bld/btf.o
  CC       /tmp/bld/nlattr.o
  CC       /tmp/bld/libbpf_errno.o
  CC       /tmp/bld/str_error.o
  CC       /tmp/bld/netlink.o
  CC       /tmp/bld/bpf_prog_linfo.o
  CC       /tmp/bld/libbpf_probes.o
  CC       /tmp/bld/xsk.o
  HOSTLD   /tmp/bld/fixdep-in.o
  LINK     /tmp/bld/fixdep
  LD       /tmp/bld/libbpf-in.o
  LINK     /tmp/bld/libbpf.a
  LINK     /tmp/bld/libbpf.so
  LINK     /tmp/bld/test_libbpf

$ head /tmp/bld/.libbpf.o.cmd
 # cannot find fixdep (/usr/local/google/home/sdf/src/linux/xxx//fixdep)
 # using basic dep data

/tmp/bld/libbpf.o: libbpf.c /usr/include/stdc-predef.h \
 /usr/include/stdlib.h /usr/include/features.h \
 /usr/include/x86_64-linux-gnu/sys/cdefs.h \
 /usr/include/x86_64-linux-gnu/bits/wordsize.h \
 /usr/include/x86_64-linux-gnu/gnu/stubs.h \
 /usr/include/x86_64-linux-gnu/gnu/stubs-64.h \
 /usr/lib/gcc/x86_64-linux-gnu/7/include/stddef.h \

After:

$ rm -rf /tmp/bld; mkdir /tmp/bld; make -j$(nproc) O=/tmp/bld -C tools/lib/bpf/

Auto-detecting system features:
...                        libelf: [ on  ]
...                           bpf: [ on  ]

  HOSTCC   /tmp/bld/fixdep.o
  HOSTLD   /tmp/bld/fixdep-in.o
  LINK     /tmp/bld/fixdep
  CC       /tmp/bld/libbpf.o
  CC       /tmp/bld/bpf.o
  CC       /tmp/bld/nlattr.o
  CC       /tmp/bld/btf.o
  CC       /tmp/bld/libbpf_errno.o
  CC       /tmp/bld/str_error.o
  CC       /tmp/bld/netlink.o
  CC       /tmp/bld/bpf_prog_linfo.o
  CC       /tmp/bld/libbpf_probes.o
  CC       /tmp/bld/xsk.o
  LD       /tmp/bld/libbpf-in.o
  LINK     /tmp/bld/libbpf.a
  LINK     /tmp/bld/libbpf.so
  LINK     /tmp/bld/test_libbpf

$ head /tmp/bld/.libbpf.o.cmd
cmd_/tmp/bld/libbpf.o := gcc -Wp,-MD,/tmp/bld/.libbpf.o.d -Wp,-MT,/tmp/bld/libbpf.o -g -Wall -DHAVE_LIBELF_MMAP_SUPPORT -DCOMPAT_NEED_REALLOCARRAY -Wbad-function-cast -Wdeclaration-after-statement -Wformat-security -Wformat-y2k -Winit-self -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wno-system-headers -Wold-style-definition -Wpacked -Wredundant-decls -Wshadow -Wstrict-prototypes -Wswitch-default -Wswitch-enum -Wundef -Wwrite-strings -Wformat -Wstrict-aliasing=3 -Werror -Wall -fPIC -I. -I/usr/local/google/home/sdf/src/linux/tools/include -I/usr/local/google/home/sdf/src/linux/tools/arch/x86/include/uapi -I/usr/local/google/home/sdf/src/linux/tools/include/uapi -fvisibility=hidden -D"BUILD_STR(s)=$(pound)s" -c -o /tmp/bld/libbpf.o libbpf.c

source_/tmp/bld/libbpf.o := libbpf.c

deps_/tmp/bld/libbpf.o := \
  /usr/include/stdc-predef.h \
  /usr/include/stdlib.h \
  /usr/include/features.h \
  /usr/include/x86_64-linux-gnu/sys/cdefs.h \
  /usr/include/x86_64-linux-gnu/bits/wordsize.h \

Fixes: 7c422f5572 ("tools build: Build fixdep helper from perf and basic libs")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-07 10:44:41 +01:00
Andrii Nakryiko
91097fbee4 btf: fix bug with resolving STRUCT/UNION into corresponding FWD
When checking available canonical candidates for struct/union algorithm
utilizes btf_dedup_is_equiv to determine if candidate is suitable. This
check is not enough when candidate is corresponding FWD for that
struct/union, because according to equivalence logic they are
equivalent. When it so happens that FWD and STRUCT/UNION end in hashing
to the same bucket, it's possible to create remapping loop from FWD to
STRUCT and STRUCT to same FWD, which will cause btf_dedup() to loop
forever.

This patch fixes the issue by additionally checking that type and
canonical candidate are strictly equal (utilizing btf_equal_struct).

Fixes: d5caef5b56 ("btf: add BTF types deduplication algorithm")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-01 01:31:48 +01:00
Andrii Nakryiko
51edf5f6e0 btf: allow to customize dedup hash table size
Default size of dedup table (16k) is good enough for most binaries, even
typical vmlinux images. But there are cases of binaries with huge amount
of BTF types (e.g., allyesconfig variants of kernel), which benefit from
having bigger dedup table size to lower amount of unnecessary hash
collisions. Tools like pahole, thus, can tune this parameter to reach
optimal performance.

This change also serves double purpose of allowing tests to force hash
collisions to test some corner cases, used in follow up patch.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-01 01:31:47 +01:00
Andrii Nakryiko
1baabdc108 libbpf: fix formatting for btf_ext__get_raw_data
Fix invalid formatting of pointer arg.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-01 01:31:47 +01:00
Dan Carpenter
3d8669e637 tools/libbpf: signedness bug in btf_dedup_ref_type()
The "ref_type_id" variable needs to be signed for the error handling
to work.

Fixes: d5caef5b56 ("btf: add BTF types deduplication algorithm")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-01 00:56:06 +01:00
Jakub Kicinski
771744f9dc tools: libbpf: make sure readelf shows full names in build checks
readelf truncates its output by default to attempt to make it more
readable.  This can lead to function names getting aliased if they
differ late in the string.  Use --wide parameter to avoid
truncation.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-01 00:53:46 +01:00
Jakub Kicinski
f74a53d9a5 tools: libbpf: add a correctly named define for map iteration
For historical reasons the helper to loop over maps in an object
is called bpf_map__for_each while it really should be called
bpf_object__for_each_map.  Rename and add a correctly named
define for backward compatibility.

Switch all in-tree users to the correct name (Quentin).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-01 00:53:45 +01:00
Magnus Karlsson
1cad078842 libbpf: add support for using AF_XDP sockets
This commit adds AF_XDP support to libbpf. The main reason for this is
to facilitate writing applications that use AF_XDP by offering
higher-level APIs that hide many of the details of the AF_XDP
uapi. This is in the same vein as libbpf facilitates XDP adoption by
offering easy-to-use higher level interfaces of XDP
functionality. Hopefully this will facilitate adoption of AF_XDP, make
applications using it simpler and smaller, and finally also make it
possible for applications to benefit from optimizations in the AF_XDP
user space access code. Previously, people just copied and pasted the
code from the sample application into their application, which is not
desirable.

The interface is composed of two parts:

* Low-level access interface to the four rings and the packet
* High-level control plane interface for creating and setting
  up umems and af_xdp sockets as well as a simple XDP program.

Tested-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-25 23:21:42 +01:00
Andrii Nakryiko
5aab392c55 tools/libbpf: support bigger BTF data sizes
While it's understandable why kernel limits number of BTF types to 65535
and size of string section to 64KB, in libbpf as user-space library it's
too restrictive. E.g., pahole converting DWARF to BTF type information
for Linux kernel generates more than 3 million BTF types and more than
3MB of strings, before deduplication. So to allow btf__dedup() to do its
work, we need to be able to load bigger BTF sections using btf__new().

Singed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-16 18:47:18 -08:00
Andrey Ignatov
789f6bab84 libbpf: Introduce bpf_object__btf
Add new accessor for bpf_object to get opaque struct btf * from it.

struct btf * is needed for all operations with BTF and it's present in
bpf_object. The only thing missing is a way to get it.

Example use-case is to get BTF key_type_id and value_type_id for a map in
bpf_object. It can be done with btf__get_map_kv_tids() but that function
requires struct btf *.

Similar API can be added for struct btf_ext but no use-case for it yet.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-15 15:20:54 +01:00
Andrey Ignatov
1a11a4c74f libbpf: Introduce bpf_map__resize
Add bpf_map__resize() to change max_entries for a map.

Quite often necessary map size is unknown at compile time and can be
calculated only at run time.

Currently the following approach is used to do so:
* bpf_object__open_buffer() to open Elf file from a buffer;
* bpf_object__find_map_by_name() to find relevant map;
* bpf_map__def() to get map attributes and create struct
  bpf_create_map_attr from them;
* update max_entries in bpf_create_map_attr;
* bpf_create_map_xattr() to create new map with updated max_entries;
* bpf_map__reuse_fd() to replace the map in bpf_object with newly
  created one.

And after all this bpf_object can finally be loaded. The map will have
new size.

It 1) is quite a lot of steps; 2) doesn't take BTF into account.

For "2)" even more steps should be made and some of them require changes
to libbpf (e.g. to get struct btf * from bpf_object).

Instead the whole problem can be solved by introducing simple
bpf_map__resize() API that checks the map and sets new max_entries if
the map is not loaded yet.

So the new steps are:
* bpf_object__open_buffer() to open Elf file from a buffer;
* bpf_object__find_map_by_name() to find relevant map;
* bpf_map__resize() to update max_entries.

That's much simpler and works with BTF.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-15 15:20:42 +01:00
Andrii Nakryiko
1ad9cbb890 tools/bpf: replace bzero with memset
bzero() call is deprecated and superseded by memset().

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reported-by: David Laight <david.laight@aculab.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-14 15:31:39 -08:00
Andrii Nakryiko
49b57e0d01 tools/bpf: remove btf__get_strings() superseded by raw data API
Now that we have btf__get_raw_data() it's trivial for tests to iterate
over all strings for testing purposes, which eliminates the need for
btf__get_strings() API.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-08 12:04:13 -08:00
Andrii Nakryiko
ae4ab4b411 btf: expose API to work with raw btf_ext data
This patch changes struct btf_ext to retain original data in sequential
block of memory, which makes it possible to expose
btf_ext__get_raw_data() interface similar to btf__get_raw_data(), allowing
users of libbpf to get access to raw representation of .BTF.ext section.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-08 12:04:13 -08:00
Andrii Nakryiko
02c874460f btf: expose API to work with raw btf data
This patch exposes new API btf__get_raw_data() that allows to get a copy
of raw BTF data out of struct btf. This is useful for external programs
that need to manipulate raw data, e.g., pahole using btf__dedup() to
deduplicate BTF type info and then writing it back to file.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-08 12:04:13 -08:00
Andrii Nakryiko
d29d87f7e6 btf: separate btf creation and loading
This change splits out previous btf__new functionality of constructing
struct btf and loading it into kernel into two:
- btf__new() just creates and initializes struct btf
- btf__load() attempts to load existing struct btf into kernel

btf__free will still close BTF fd, if it was ever loaded successfully
into kernel.

This change allows users of libbpf to manipulate BTF using its API,
without the need to unnecessarily load it into kernel.

One of the intended use cases is pahole, which will do DWARF to BTF
conversion and then use libbpf to do type deduplication, while then
handling ELF sections overwriting and other concerns on its own.

Fixes: 2d3feca8c4 ("bpf: btf: print map dump and lookup with btf info")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-08 12:04:13 -08:00
Yonghong Song
a4021a3579 tools/bpf: add log_level to bpf_load_program_attr
The kernel verifier has three levels of logs:
    0: no logs
    1: logs mostly useful
  > 1: verbose

Current libbpf API functions bpf_load_program_xattr() and
bpf_load_program() cannot specify log_level.
The bcc, however, provides an interface for user to
specify log_level 2 for verbose output.

This patch added log_level into structure
bpf_load_program_attr, so users, including bcc, can use
bpf_load_program_xattr() to change log_level. The
supported log_level is 0, 1, and 2.

The bpf selftest test_sock.c is modified to enable log_level = 2.
If the "verbose" in test_sock.c is changed to true,
the test will output logs like below:
  $ ./test_sock
  func#0 @0
  0: R1=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
  0: (bf) r6 = r1
  1: R1=ctx(id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
  1: (61) r7 = *(u32 *)(r6 +28)
  invalid bpf_context access off=28 size=4

  Test case: bind4 load with invalid access: src_ip6 .. [PASS]
  ...
  Test case: bind6 allow all .. [PASS]
  Summary: 16 PASSED, 0 FAILED

Some test_sock tests are negative tests and verbose verifier
log will be printed out as shown in the above.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-07 18:22:31 -08:00
Andrii Nakryiko
62b8cea62e tools/bpf: add missing strings.h include
Few files in libbpf are using bzero() function (defined in strings.h header), but
don't include corresponding header. When libbpf is added as a dependency to pahole,
this undeterministically causes warnings on some machines:

bpf.c:225:2: warning: implicit declaration of function 'bzero' [-Wimplicit-function-declaration]
  bzero(&attr, sizeof(attr));
    ^~~~~

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-07 18:18:42 -08:00
Yonghong Song
f7748e2952 tools/bpf: silence a libbpf unnecessary warning
Commit 96408c4344 ("tools/bpf: implement libbpf
btf__get_map_kv_tids() API function") refactored
function bpf_map_find_btf_info() and moved bulk of
implementation into btf.c as btf__get_map_kv_tids().
This change introduced a bug such that test_btf will
print out the following warning although the test passed:
  BTF libbpf test[2] (test_btf_nokv.o): libbpf: map:btf_map
      container_name:____btf_map_btf_map cannot be found
      in BTF. Missing BPF_ANNOTATE_KV_PAIR?

Previously, the error message is guarded with pr_debug().
Commit 96408c4344 changed it to pr_warning() and
hence caused the warning.

Restoring to pr_debug() for the message fixed the issue.

Fixes: 96408c4344 ("tools/bpf: implement libbpf btf__get_map_kv_tids() API function")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-05 22:07:03 -08:00
Yonghong Song
a6c109a6b7 tools/bpf: add const qualifier to btf__get_map_kv_tids() map_name parameter
Commit 96408c4344 ("tools/bpf: implement libbpf btf__get_map_kv_tids() API function")
added the API function btf__get_map_kv_tids():
  btf__get_map_kv_tids(const struct btf *btf, char *map_name, ...)

The parameter map_name has type "char *". This is okay inside libbpf library since
the map_name is from bpf_map->name which also has type "char *".

This will be problematic if the caller for map_name already has attribute "const",
e.g., from C++ string.c_str(). It will result in either a warning or an error.

  /home/yhs/work/bcc/src/cc/btf.cc:166:51:
    error: invalid conversion from ‘const char*’ to ‘char*’ [-fpermissive]
      return btf__get_map_kv_tids(btf_, map_name.c_str()

This patch added "const" attributes to map_name parameter.

Fixes: 96408c4344 ("tools/bpf: implement libbpf btf__get_map_kv_tids() API function")
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-05 18:38:58 -08:00
Andrii Nakryiko
9c65112744 selftests/btf: add initial BTF dedup tests
This patch sets up a new kind of tests (BTF dedup tests) and tests few aspects of
BTF dedup algorithm. More complete set of tests will come in follow up patches.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05 16:52:57 +01:00
Andrii Nakryiko
d5caef5b56 btf: add BTF types deduplication algorithm
This patch implements BTF types deduplication algorithm. It allows to
greatly compress typical output of pahole's DWARF-to-BTF conversion or
LLVM's compilation output by detecting and collapsing identical types emitted in
isolation per compilation unit. Algorithm also resolves struct/union forward
declarations into concrete BTF types representing referenced struct/union. If
undesired, this resolution can be disabled through specifying corresponding options.

Algorithm itself and its application to Linux kernel's BTF types is
described in details at:
https://facebookmicrosites.github.io/bpf/blog/2018/11/14/btf-enhancement.html

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05 16:52:57 +01:00
Andrii Nakryiko
69eaab04c6 btf: extract BTF type size calculation
This pre-patch extracts calculation of amount of space taken by BTF type descriptor
for later reuse by btf_dedup functionality.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-05 16:52:57 +01:00
Stanislav Fomichev
a8a1f7d09c libbpf: fix libbpf_print
With the recent print rework we now have the following problem:
pr_{warning,info,debug} expand to __pr which calls libbpf_print.
libbpf_print does va_start and calls __libbpf_pr with va_list argument.
In __base_pr we again do va_start. Because the next argument is a
va_list, we don't get correct pointer to the argument (and print noting
in my case, I don't know why it doesn't crash tbh).

Fix this by changing libbpf_print_fn_t signature to accept va_list and
remove unneeded calls to va_start in the existing users.

Alternatively, this can we solved by exporting __libbpf_pr and
changing __pr macro to (and killing libbpf_print):
{
	if (__libbpf_pr)
		__libbpf_pr(level, "libbpf: " fmt, ##__VA_ARGS__)
}

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04 17:45:31 -08:00
Yonghong Song
96408c4344 tools/bpf: implement libbpf btf__get_map_kv_tids() API function
Currently, to get map key/value type id's, the macro
  BPF_ANNOTATE_KV_PAIR(<map_name>, <key_type>, <value_type>)
needs to be defined in the bpf program for the
corresponding map.

During program/map loading time,
the local static function bpf_map_find_btf_info()
in libbpf.c is implemented to retrieve the key/value
type ids given the map name.

The patch refactored function bpf_map_find_btf_info()
to create an API btf__get_map_kv_tids() which includes
the bulk of implementation for the original function.
The API btf__get_map_kv_tids() can be used by bcc,
a JIT based bpf compilation system, which uses the
same BPF_ANNOTATE_KV_PAIR to record map key/value types.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04 12:48:36 -08:00
Yonghong Song
b8dcf8d149 tools/bpf: expose functions btf_ext__* as API functions
The following set of functions, which manipulates .BTF.ext
section, are exposed as API functions:
  . btf_ext__new
  . btf_ext__free
  . btf_ext__reloc_func_info
  . btf_ext__reloc_line_info
  . btf_ext__func_info_rec_size
  . btf_ext__line_info_rec_size

These functions are useful for JIT based bpf codegen, e.g.,
bcc, to manipulate in-memory .BTF.ext sections.

The signature of function btf_ext__reloc_func_info()
is also changed to be the same as its definition in btf.c.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04 12:48:36 -08:00
Yonghong Song
6f1ae8b662 tools/bpf: simplify libbpf API function libbpf_set_print()
Currently, the libbpf API function libbpf_set_print()
takes three function pointer parameters for warning, info
and debug printout respectively.

This patch changes the API to have just one function pointer
parameter and the function pointer has one additional
parameter "debugging level". So if in the future, if
the debug level is increased, the function signature
won't change.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04 09:40:59 -08:00
Yonghong Song
9d100a19ff tools/bpf: print out btf log at LIBBPF_WARN level
Currently, the btf log is allocated and printed out in case
of error at LIBBPF_DEBUG level.
Such logs from kernel are very important for debugging.
For example, bpf syscall BPF_PROG_LOAD command can get
verifier logs back to user space. In function load_program()
of libbpf.c, the log buffer is allocated unconditionally
and printed out at pr_warning() level.

Let us do the similar thing here for btf. Allocate buffer
unconditionally and print out error logs at pr_warning() level.
This can reduce one global function and
optimize for common situations where pr_warning()
is activated either by default or by user supplied
debug output function.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04 09:40:58 -08:00
Yonghong Song
8461ef8b7e tools/bpf: move libbpf pr_* debug print functions to headers
A global function libbpf_print, which is invisible
outside the shared library, is defined to print based
on levels. The pr_warning, pr_info and pr_debug
macros are moved into the newly created header
common.h. So any .c file including common.h can
use these macros directly.

Currently btf__new and btf_ext__new API has an argument getting
__pr_debug function pointer into btf.c so the debugging information
can be printed there. This patch removed this parameter
from btf__new and btf_ext__new and directly using pr_debug in btf.c.

Another global function libbpf_print_level_available, also
invisible outside the shared library, can test
whether a particular level debug printing is
available or not. It is used in btf.c to
test whether DEBUG level debug printing is availabl or not,
based on which the log buffer will be allocated when loading
btf to the kernel.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04 09:40:58 -08:00
Maciej Fijalkowski
50db9f0731 libbpf: Add a support for getting xdp prog id on ifindex
Since we have a dedicated netlink attributes for xdp setup on a
particular interface, it is now possible to retrieve the program id that
is currently attached to the interface. The use case is targeted for
sample xdp programs, which will store the program id just after loading
bpf program onto iface. On shutdown, the sample will make sure that it
can unload the program by querying again the iface and verifying that
both program id's matches.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 23:37:51 +01:00
Maciej Fijalkowski
f3cea32d56 libbpf: Add a helper for retrieving a map fd for a given name
XDP samples are mostly cooperating with eBPF maps through their file
descriptors. In case of a eBPF program that contains multiple maps it
might be tiresome to iterate through them and call bpf_map__fd for each
one. Add a helper mostly based on bpf_object__find_map_by_name, but
instead of returning the struct bpf_map pointer, return map fd.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 23:37:50 +01:00
Alexei Starovoitov
df5d22facd libbpf: introduce bpf_map_lookup_elem_flags()
Introduce
int bpf_map_lookup_elem_flags(int fd, const void *key, void *value, __u64 flags)
helper to lookup array/hash/cgroup_local_storage elements with BPF_F_LOCK flag.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-01 20:55:39 +01:00
David S. Miller
ec7146db15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
   removing conditional branches around dead code and to shrink the
   resulting image. Code store constrained architectures like nfp would
   have hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
   code generation for 32-bit sub-registers. Evaluation shows that this
   can result in code reduction of ~5-20% compared to 64 bit-only code
   generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
   vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
   parameters from the system or for a specific network device e.g. in
   terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
   space which provides information about the RX/TX/fill/completion
   rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
   existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
   in order better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
    failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
    handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 19:38:33 -08:00
Taeung Song
c76e4c228b libbpf: Show supported ELF section names when failing to guess prog/attach type
We need to let users check their wrong ELF section name with proper
ELF section names when they fail to get a prog/attach type from it.
Because users can't realize libbpf guess prog/attach types from given
ELF section names. For example, when a 'cgroup' section name of a
BPF program is used, show available ELF section names(types).

Before:

    $ bpftool prog load bpf-prog.o /sys/fs/bpf/prog1
    Error: failed to guess program type based on ELF section name cgroup

After:

    libbpf: failed to guess program type based on ELF section name 'cgroup'
    libbpf: supported section(type) names are: socket kprobe/ kretprobe/ classifier action tracepoint/ raw_tracepoint/ xdp perf_event lwt_in lwt_out lwt_xmit lwt_seg6local cgroup_skb/ingress cgroup_skb/egress cgroup/skb cgroup/sock cgroup/post_bind4 cgroup/post_bind6 cgroup/dev sockops sk_skb/stream_parser sk_skb/stream_verdict sk_skb sk_msg lirc_mode2 flow_dissector cgroup/bind4 cgroup/bind6 cgroup/connect4 cgroup/connect6 cgroup/sendmsg4 cgroup/sendmsg6

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Andrey Ignatov <rdna@fb.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-23 12:27:04 +01:00
Quentin Monnet
2d3ea5e85d tools: bpftool: add probes for eBPF helper functions
Similarly to what was done for program types and map types, add a set of
probes to test the availability of the different eBPF helper functions
on the current system.

For each known program type, all known helpers are tested, in order to
establish a compatibility matrix. Output is provided as a set of lists
of available helpers, one per program type.

Sample output:

    # bpftool feature probe kernel
    ...
    Scanning eBPF helper functions...
    eBPF helpers supported for program type socket_filter:
            - bpf_map_lookup_elem
            - bpf_map_update_elem
            - bpf_map_delete_elem
    ...
    eBPF helpers supported for program type kprobe:
            - bpf_map_lookup_elem
            - bpf_map_update_elem
            - bpf_map_delete_elem
    ...

    # bpftool --json --pretty feature probe kernel
    {
        ...
        "helpers": {
            "socket_filter_available_helpers": ["bpf_map_lookup_elem", \
                    "bpf_map_update_elem","bpf_map_delete_elem", ...
            ],
            "kprobe_available_helpers": ["bpf_map_lookup_elem", \
                    "bpf_map_update_elem","bpf_map_delete_elem", ...
            ],
            ...
        }
    }

v5:
- In libbpf.map, move global symbol to the new LIBBPF_0.0.2 section.

v4:
- Use "enum bpf_func_id" instead of "__u32" in bpf_probe_helper()
  declaration for the type of the argument used to pass the id of
  the helper to probe.
- Undef BPF_HELPER_MAKE_ENTRY after using it.

v3:
- Do not pass kernel version from bpftool to libbpf probes (kernel
  version for testing program with kprobes is retrieved directly from
  libbpf).
- Dump one list of available helpers per program type (instead of one
  list of compatible program types per helper).

v2:
- Move probes from bpftool to libbpf.
- Test all program types for each helper, print a list of working prog
  types for each helper.
- Fall back on include/uapi/linux/bpf.h for names and ids of helpers.
- Remove C-style macros output from this patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-22 22:15:40 -08:00
Quentin Monnet
f99e166397 tools: bpftool: add probes for eBPF map types
Add new probes for eBPF map types, to detect what are the ones available
on the system. Try creating one map of each type, and see if the kernel
complains.

Sample output:

    # bpftool feature probe kernel
    ...
    Scanning eBPF map types...
    eBPF map_type hash is available
    eBPF map_type array is available
    eBPF map_type prog_array is available
    ...

    # bpftool --json --pretty feature probe kernel
    {
        ...
        "map_types": {
            "have_hash_map_type": true,
            "have_array_map_type": true,
            "have_prog_array_map_type": true,
            ...
        }
    }

v5:
- In libbpf.map, move global symbol to the new LIBBPF_0.0.2 section.

v3:
- Use a switch with all enum values for setting specific map parameters,
  so that gcc complains at compile time (-Wswitch-enum) if new map types
  were added to the kernel but libbpf was not updated.

v2:
- Move probes from bpftool to libbpf.
- Remove C-style macros output from this patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-22 22:15:40 -08:00
Quentin Monnet
1bf4b05810 tools: bpftool: add probes for eBPF program types
Introduce probes for supported BPF program types in libbpf, and call it
from bpftool to test what types are available on the system. The probe
simply consists in loading a very basic program of that type and see if
the verifier complains or not.

Sample output:

    # bpftool feature probe kernel
    ...
    Scanning eBPF program types...
    eBPF program_type socket_filter is available
    eBPF program_type kprobe is available
    eBPF program_type sched_cls is available
    ...

    # bpftool --json --pretty feature probe kernel
    {
        ...
        "program_types": {
            "have_socket_filter_prog_type": true,
            "have_kprobe_prog_type": true,
            "have_sched_cls_prog_type": true,
            ...
        }
    }

v5:
- In libbpf.map, move global symbol to a new LIBBPF_0.0.2 section.
- Rename (non-API function) prog_load() as probe_load().

v3:
- Get kernel version for checking kprobes availability from libbpf
  instead of from bpftool. Do not pass kernel_version as an argument
  when calling libbpf probes.
- Use a switch with all enum values for setting specific program
  parameters just before probing, so that gcc complains at compile time
  (-Wswitch-enum) if new prog types were added to the kernel but libbpf
  was not updated.
- Add a comment in libbpf.h about setrlimit() usage to allow many
  consecutive probe attempts.

v2:
- Move probes from bpftool to libbpf.
- Remove C-style macros output from this patch.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-22 22:15:40 -08:00
Stanislav Fomichev
eeedd3527d libbpf: don't define CC and AR
We are already including tools/scripts/Makefile.include which correctly
handles CROSS_COMPILE, no need to define our own vars.

See related commit 7ed1c1901f ("tools: fix cross-compile var clobbering")
for more details.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-16 22:47:43 +01:00
Lorenz Bauer
86edaed379 bpf: libbpf: retry loading program on EAGAIN
Commit c3494801cd ("bpf: check pending signals while
verifying programs") makes it possible for the BPF_PROG_LOAD
to fail with EAGAIN. Retry unconditionally in this case.

Fixes: c3494801cd ("bpf: check pending signals while verifying programs")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-15 21:35:34 +01:00
Stanislav Fomichev
e3ca63de8a selftests/bpf: add missing executables to .gitignore
We build test_libbpf with CXX to make sure linking against C++ works.

$ make -s -C tools/lib/bpf
$ git status -sb
? tools/lib/bpf/test_libbpf
$ make -s -C tools/testing/selftests/bpf
$ git status -sb
? tools/lib/bpf/test_libbpf
? tools/testing/selftests/bpf/test_libbpf

Fixes: 8c4905b995 ("libbpf: make sure bpf headers are c++ include-able")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-10 15:53:02 +01:00
Daniel Borkmann
80f21ff987 bpf, doc: add note for libbpf's stand-alone build
Given this came up couple of times, add a note to libbpf's readme
about the semi-automated mirror for a stand-alone build which is
officially managed by BPF folks. While at it, also explicitly state
the libbpf license in the readme file.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-07 15:52:00 -08:00
Prashant Bhole
07a09d1b73 bpf: libbpf: fix memleak by freeing line_info
This patch fixes a memory leak in libbpf by freeing up line_info
member of struct bpf_program while unloading a program.

Fixes: 3d65014146 ("bpf: libbpf: Add btf_line_info support to libbpf")
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-18 01:16:23 +01:00
Martin KaFai Lau
177e77169b bpf: Remove !func_info and !line_info check from test_btf and bpftool
kernel can provide the func_info and line_info even
it fails the btf_dump_raw_ok() test because they don't contain
kernel address.  This patch removes the corresponding '== 0'
test.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-13 12:16:31 +01:00
Yonghong Song
cfc542411b tools/bpf: rename *_info_cnt to nr_*_info
Rename all occurances of *_info_cnt field access
to nr_*_info in tools directory.

The local variables finfo_cnt, linfo_cnt and jited_linfo_cnt
in function do_dump() of tools/bpf/bpftool/prog.c are also
changed to nr_finfo, nr_linfo and nr_jited_linfo to
keep naming convention consistent.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-10 14:51:45 -08:00
Martin KaFai Lau
b053b439b7 bpf: libbpf: bpftool: Print bpf_line_info during prog dump
This patch adds print bpf_line_info function in 'prog dump jitted'
and 'prog dump xlated':

[root@arch-fb-vm1 bpf]# ~/devshare/fb-kernel/linux/tools/bpf/bpftool/bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
[...]
int test_long_fname_2(struct dummy_tracepoint_args * arg):
bpf_prog_44a040bf25481309_test_long_fname_2:
; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
   0:	push   %rbp
   1:	mov    %rsp,%rbp
   4:	sub    $0x30,%rsp
   b:	sub    $0x28,%rbp
   f:	mov    %rbx,0x0(%rbp)
  13:	mov    %r13,0x8(%rbp)
  17:	mov    %r14,0x10(%rbp)
  1b:	mov    %r15,0x18(%rbp)
  1f:	xor    %eax,%eax
  21:	mov    %rax,0x20(%rbp)
  25:	xor    %esi,%esi
; int key = 0;
  27:	mov    %esi,-0x4(%rbp)
; if (!arg->sock)
  2a:	mov    0x8(%rdi),%rdi
; if (!arg->sock)
  2e:	cmp    $0x0,%rdi
  32:	je     0x0000000000000070
  34:	mov    %rbp,%rsi
; counts = bpf_map_lookup_elem(&btf_map, &key);
  37:	add    $0xfffffffffffffffc,%rsi
  3b:	movabs $0xffff8881139d7480,%rdi
  45:	add    $0x110,%rdi
  4c:	mov    0x0(%rsi),%eax
  4f:	cmp    $0x4,%rax
  53:	jae    0x000000000000005e
  55:	shl    $0x3,%rax
  59:	add    %rdi,%rax
  5c:	jmp    0x0000000000000060
  5e:	xor    %eax,%eax
; if (!counts)
  60:	cmp    $0x0,%rax
  64:	je     0x0000000000000070
; counts->v6++;
  66:	mov    0x4(%rax),%edi
  69:	add    $0x1,%rdi
  6d:	mov    %edi,0x4(%rax)
  70:	mov    0x0(%rbp),%rbx
  74:	mov    0x8(%rbp),%r13
  78:	mov    0x10(%rbp),%r14
  7c:	mov    0x18(%rbp),%r15
  80:	add    $0x28,%rbp
  84:	leaveq
  85:	retq
[...]

With linum:
[root@arch-fb-vm1 bpf]# ~/devshare/fb-kernel/linux/tools/bpf/bpftool/bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv linum
int _dummy_tracepoint(struct dummy_tracepoint_args * arg):
bpf_prog_b07ccb89267cf242__dummy_tracepoint:
; return test_long_fname_1(arg); [file:/data/users/kafai/fb-kernel/linux/tools/testing/selftests/bpf/test_btf_haskv.c line_num:54 line_col:9]
   0:	push   %rbp
   1:	mov    %rsp,%rbp
   4:	sub    $0x28,%rsp
   b:	sub    $0x28,%rbp
   f:	mov    %rbx,0x0(%rbp)
  13:	mov    %r13,0x8(%rbp)
  17:	mov    %r14,0x10(%rbp)
  1b:	mov    %r15,0x18(%rbp)
  1f:	xor    %eax,%eax
  21:	mov    %rax,0x20(%rbp)
  25:	callq  0x000000000000851e
; return test_long_fname_1(arg); [file:/data/users/kafai/fb-kernel/linux/tools/testing/selftests/bpf/test_btf_haskv.c line_num:54 line_col:2]
  2a:	xor    %eax,%eax
  2c:	mov    0x0(%rbp),%rbx
  30:	mov    0x8(%rbp),%r13
  34:	mov    0x10(%rbp),%r14
  38:	mov    0x18(%rbp),%r15
  3c:	add    $0x28,%rbp
  40:	leaveq
  41:	retq
[...]

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09 13:54:38 -08:00
Martin KaFai Lau
3d65014146 bpf: libbpf: Add btf_line_info support to libbpf
This patch adds bpf_line_info support to libbpf:
1) Parsing the line_info sec from ".BTF.ext"
2) Relocating the line_info.  If the main prog *_info relocation
   fails, it will ignore the remaining subprog line_info and continue.
   If the subprog *_info relocation fails, it will bail out.
3) BPF_PROG_LOAD a prog with line_info

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09 13:54:38 -08:00
Martin KaFai Lau
f0187f0b17 bpf: libbpf: Refactor and bug fix on the bpf_func_info loading logic
This patch refactor and fix a bug in the libbpf's bpf_func_info loading
logic.  The bug fix and refactoring are targeting the same
commit 2993e0515b ("tools/bpf: add support to read .BTF.ext sections")
which is in the bpf-next branch.

1) In bpf_load_program_xattr(), it should retry when errno == E2BIG
   regardless of log_buf and log_buf_sz.  This patch fixes it.

2) btf_ext__reloc_init() and btf_ext__reloc() are essentially
   the same except btf_ext__reloc_init() always has insns_cnt == 0.
   Hence, btf_ext__reloc_init() is removed.

   btf_ext__reloc() is also renamed to btf_ext__reloc_func_info()
   to get ready for the line_info support in the next patch.

3) Consolidate func_info section logic from "btf_ext_parse_hdr()",
   "btf_ext_validate_func_info()" and "btf_ext__new()" to
   a new function "btf_ext_copy_func_info()" such that similar
   logic can be reused by the later libbpf's line_info patch.

4) The next line_info patch will store line_info_cnt instead of
   line_info_len in the bpf_program because the kernel is taking
   line_info_cnt also.  It will save a few "len" to "cnt" conversions
   and will also save some function args.

   Hence, this patch also makes bpf_program to store func_info_cnt
   instead of func_info_len.

5) btf_ext depends on btf.  e.g. the func_info's type_id
   in ".BTF.ext" is not useful when ".BTF" is absent.
   This patch only init the obj->btf_ext pointer after
   it has successfully init the obj->btf pointer.

   This can avoid always checking "obj->btf && obj->btf_ext"
   together for accessing ".BTF.ext".  Checking "obj->btf_ext"
   alone will do.

6) Move "struct btf_sec_func_info" from btf.h to btf.c.
   There is no external usage outside btf.c.

Fixes: 2993e0515b ("tools/bpf: add support to read .BTF.ext sections")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-09 13:54:38 -08:00
Martin KaFai Lau
84ecc1f98c bpf: Expect !info.func_info and insn_off name changes in test_btf/libbpf/bpftool
Similar to info.jited_*, info.func_info could be 0 if
bpf_dump_raw_ok() == false.

This patch makes changes to test_btf and bpftool to expect info.func_info
could be 0.

This patch also makes the needed changes for s/insn_offset/insn_off/.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-05 18:48:40 -08:00
Lorenz Bauer
64a975913b libbpf: add bpf_prog_test_run_xattr
Add a new function, which encourages safe usage of the test interface.
bpf_prog_test_run continues to work as before, but should be considered
unsafe.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-04 08:18:13 -08:00
Andrey Ignatov
de94b651ee libbpf: Fix license in README.rst
The whole libbpf is licensed as (LGPL-2.1 OR BSD-2-Clause). I missed it
while adding README.rst. Fix it and use same license as all other files
in libbpf do. Since I'm the only author of README.rst so far, no others'
permissions should be needed.

Fixes: 76d1b894c5 ("libbpf: Document API and ABI conventions")
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-03 21:36:38 +01:00
David Miller
e9ee9efc0d bpf: Add BPF_F_ANY_ALIGNMENT.
Often we want to write tests cases that check things like bad context
offset accesses.  And one way to do this is to use an odd offset on,
for example, a 32-bit load.

This unfortunately triggers the alignment checks first on platforms
that do not set CONFIG_EFFICIENT_UNALIGNED_ACCESS.  So the test
case see the alignment failure rather than what it was testing for.

It is often not completely possible to respect the original intention
of the test, or even test the same exact thing, while solving the
alignment issue.

Another option could have been to check the alignment after the
context and other validations are performed by the verifier, but
that is a non-trivial change to the verifier.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-30 21:38:48 -08:00
Yonghong Song
b42699547f tools/bpf: make libbpf _GNU_SOURCE friendly
During porting libbpf to bcc, I got some warnings like below:
  ...
  [  2%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf.c.o
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf.c:12:0:
  warning: "_GNU_SOURCE" redefined [enabled by default]
   #define _GNU_SOURCE
  ...
  [  3%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf_errno.c.o
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c: In function ‘libbpf_strerror’:
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c:45:7:
  warning: assignment makes integer from pointer without a cast [enabled by default]
     ret = strerror_r(err, buf, size);
  ...

bcc is built with _GNU_SOURCE defined and this caused the above warning.
This patch intends to make libpf _GNU_SOURCE friendly by
  . define _GNU_SOURCE in libbpf.c unless it is not defined
  . undefine _GNU_SOURCE as non-gnu version of strerror_r is expected.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-30 02:41:02 +01:00
David Miller
1ad93ab10e bpf: Fix various lib and testsuite build failures on 32-bit.
Cannot cast a u64 to a pointer on 32-bit without an intervening (long)
cast otherwise GCC warns.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-28 16:10:59 -08:00
Andrey Ignatov
76d1b894c5 libbpf: Document API and ABI conventions
Document API and ABI for libbpf: naming convention, symbol visibility,
ABI versioning.

This is just a starting point. Documentation can be significantly
extended in the future to cover more topics.

ABI versioning section touches only a few basic points with a link to
more comprehensive documentation from Ulrich Drepper. This section can
be extended in the future when there is better understanding what works
well and what not so well in libbpf development process and production
usage.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26 18:57:14 -08:00
Andrey Ignatov
306b267cb3 libbpf: Verify versioned symbols
Since ABI versioning info is kept separately from the code it's easy to
forget to update it while adding a new API.

Add simple verification that all global symbols exported with LIBBPF_API
are versioned in libbpf.map version script.

The idea is to check that number of global symbols in libbpf-in.o, that
is the input to the linker, matches with number of unique versioned
symbols in libbpf.so, that is the output of the linker. If these numbers
don't match, it may mean some symbol was not versioned and make will
fail.

"Unique" means that if a symbol is present in more than one version of
ABI due to ABI changes, it'll be counted once.

Another option to calculate number of global symbols in the "input"
could be to count number of LIBBPF_ABI entries in C headers but it seems
to be fragile.

Example of output when a symbol is missing in version script:

    ...
    LD       libbpf-in.o
    LINK     libbpf.a
    LINK     libbpf.so
  Warning: Num of global symbols in libbpf-in.o (115) does NOT match
  with num of versioned symbols in libbpf.so (114). Please make sure all
  LIBBPF_API symbols are versioned in libbpf.map.
  make: *** [check_abi] Error 1

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26 18:57:14 -08:00
Andrey Ignatov
16192a771d libbpf: Add version script for DSO
More and more projects use libbpf and one day it'll likely be packaged
and distributed as DSO and that requires ABI versioning so that both
compatible and incompatible changes to ABI can be introduced in a safe
way in the future without breaking executables dynamically linked with a
previous version of the library.

Usual way to do ABI versioning is version script for the linker. Add
such a script for libbpf. All global symbols currently exported via
LIBBPF_API macro are added to the version script libbpf.map.

The version name LIBBPF_0.0.1 is constructed from the name of the
library + version specified by $(LIBBPF_VERSION) in Makefile.

Version script does not duplicate the work done by LIBBPF_API macro, it
rather complements it. The macro is used at compile time and can be used
by compiler to do optimization that can't be done at link time, it is
purely about global symbol visibility. The version script, in turn, is
used at link time and takes care of ABI versioning. Both techniques are
described in details in [1].

Whenever ABI is changed in the future, version script should be changed
appropriately.

[1] https://www.akkadia.org/drepper/dsohowto.pdf

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26 18:57:14 -08:00
Martin KaFai Lau
1d2f44ca34 libbpf: Name changing for btf_get_from_id
s/btf_get_from_id/btf__get_from_id/ to restore the API naming convention.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-26 18:57:14 -08:00
Nikita V. Shirokov
47ae7e3d0b libbpf: make bpf_object__open default to UNSPEC
currently by default libbpf's bpf_object__open requires
bpf's program to specify  version in a code because of two things:
1) default prog type is set to KPROBE
2) KPROBE requires (in kernel/bpf/syscall.c) version to be specified

in this patch i'm changing default prog type to UNSPEC and also changing
requirments for version's section to be present in object file.
now it would reflect what we have today in kernel
(only KPROBE prog type requires for version to be explicitly set).

v1 -> v2:
 - RFC tag has been dropped

Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-23 22:27:05 +01:00
Nikita V. Shirokov
addb9fc90f bpf: adding support for map in map in libbpf
idea is pretty simple. for specified map (pointed by struct bpf_map)
we would provide descriptor of already loaded map, which is going to be
used as a prototype for inner map. proposed workflow:
1) open bpf's object (bpf_object__open)
2) create bpf's map which is going to be used as a prototype
3) find (by name) map-in-map which you want to load and update w/
descriptor of inner map w/ a new helper from this patch
4) load bpf program w/ bpf_object__load

Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21 23:33:21 +01:00
Stanislav Fomichev
5b32a23e1d bpf: libbpf: don't specify prog name if kernel doesn't support it
Use recently added capability check.

See commit 23499442c3 ("bpf: libbpf: retry map creation without
the name") for rationale.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21 23:26:14 +01:00
Stanislav Fomichev
94cb310cfa bpf: libbpf: remove map name retry from bpf_create_map_xattr
Instead, check for a newly created caps.name bpf_object capability.
If kernel doesn't support names, don't specify the attribute.

See commit 23499442c3 ("bpf: libbpf: retry map creation without
the name") for rationale.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21 23:26:04 +01:00
Stanislav Fomichev
47eff61777 bpf, libbpf: introduce bpf_object__probe_caps to test BPF capabilities
It currently only checks whether kernel supports map/prog names.
This capability check will be used in the next two commits to
skip setting prog/map names.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21 23:25:33 +01:00
Stanislav Fomichev
8c4905b995 libbpf: make sure bpf headers are c++ include-able
Wrap headers in extern "C", to turn off C++ mangling.
This simplifies including libbpf in c++ and linking against it.

v2 changes:
* do the same for btf.h

v3 changes:
* test_libbpf.cpp to test for possible future c++ breakages

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21 23:15:41 +01:00
Yonghong Song
462c124c59 bpf: fix a libbpf loader issue
Commit 2993e0515b ("tools/bpf: add support to read .BTF.ext sections")
added support to read .BTF.ext sections from an object file, create
and pass prog_btf_fd and func_info to the kernel.

The program btf_fd (prog->btf_fd) is initialized to be -1 to please
zclose so we do not need special handling dur prog close.
Passing -1 to the kernel, however, will cause loading error.
Passing btf_fd 0 to the kernel if prog->btf_fd is invalid
fixed the problem.

Fixes: 2993e0515b ("tools/bpf: add support to read .BTF.ext sections")
Reported-by: Andrey Ignatov <rdna@fb.com>
Reported-by: Emre Cantimur <haydum@fb.com>
Tested-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-21 22:22:17 +01:00
Yonghong Song
d7f5b5e051 tools/bpf: refactor to implement btf_get_from_id() in lib/bpf
The function get_btf() is implemented in tools/bpf/bpftool/map.c
to get a btf structure given a map_info. This patch
refactored this function to be function btf_get_from_id()
in tools/lib/bpf so that it can be used later.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 10:54:39 -08:00
Yonghong Song
2993e0515b tools/bpf: add support to read .BTF.ext sections
The .BTF section is already available to encode types.
These types can be used for map
pretty print. The whole .BTF will be passed to the
kernel as well for which kernel can verify and return
to the user space for pretty print etc.

The llvm patch at https://reviews.llvm.org/D53736
will generate .BTF section and one more section .BTF.ext.
The .BTF.ext section encodes function type
information and line information. Note that
this patch set only supports function type info.
The functionality is implemented in libbpf.

The .BTF section can be directly loaded into the
kernel, and the .BTF.ext section cannot. The loader
may need to do some relocation and merging,
similar to merging multiple code sections, before
loading into the kernel.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 10:54:39 -08:00
Yonghong Song
7e0d0fb552 tools/bpf: add new fields for program load in lib/bpf
The new fields are added for program load in lib/bpf so
application uses api bpf_load_program_xattr() is able
to load program with btf and func_info data.

This functionality will be used in next patch
by bpf selftest test_btf.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 10:54:39 -08:00
Martin KaFai Lau
78a2540e89 tools/bpf: Add tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC
This patch adds unit tests for BTF_KIND_FUNC_PROTO and
BTF_KIND_FUNC to test_btf.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-20 10:54:38 -08:00
Stanislav Fomichev
23499442c3 bpf: libbpf: retry map creation without the name
Since commit 88cda1c9da ("bpf: libbpf: Provide basic API support
to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name
for maps. Pre v4.14 kernels don't know about map names and return an
error about unexpected non-zero data. Retry sys_bpf without a map
name to cover older kernels.

v2 changes:
* check for errno == EINVAL as suggested by Daniel Borkmann

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-20 00:49:32 +01:00
Martin KaFai Lau
a83d6e76a6 bpf: libbpf: Fix bpf_program__next() API
This patch restores the behavior in
commit eac7d84519 ("tools: libbpf: don't return '.text' as a program for multi-function programs")
such that bpf_program__next() does not return pseudo programs in ".text".

Fixes: 0c19a9fbc9 ("libbpf: cleanup after partial failure in bpf_object__pin")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-16 17:46:54 -08:00
Stanislav Fomichev
33a2c75c55 libbpf: add internal pin_name
pin_name is the same as section_name where '/' is replaced
by '_'. bpf_object__pin_programs is converted to use pin_name
to avoid the situation where section_name would require creating another
subdirectory for a pin (as, for example, when calling bpf_object__pin_programs
for programs in sections like "cgroup/connect6").

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:56:11 -08:00
Stanislav Fomichev
fd734c5cca libbpf: bpf_program__pin: add special case for instances.nr == 1
When bpf_program has only one instance, don't create a subdirectory with
per-instance pin files (<prog>/0). Instead, just create a single pin file
for that single instance. This simplifies object pinning by not creating
unnecessary subdirectories.

This can potentially break existing users that depend on the case
where '/0' is always created. However, I couldn't find any serious
usage of bpf_program__pin inside the kernel tree and I suppose there
should be none outside.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:56:10 -08:00
Stanislav Fomichev
0c19a9fbc9 libbpf: cleanup after partial failure in bpf_object__pin
bpftool will use bpf_object__pin in the next commits to pin all programs
and maps from the file; in case of a partial failure, we need to get
back to the clean state (undo previous program/map pins).

As part of a cleanup, I've added and exported separate routines to
pin all maps (bpf_object__pin_maps) and progs (bpf_object__pin_programs)
of an object.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:56:10 -08:00
Andrey Ignatov
3615353218 libbpf: Fix compile error in libbpf_attach_type_by_name
Arnaldo Carvalho de Melo reported build error in libbpf when clang
version 3.8.1-24 (tags/RELEASE_381/final) is used:

libbpf.c:2201:36: error: comparison of constant -22 with expression of
type 'const enum bpf_attach_type' is always false
[-Werror,-Wtautological-constant-out-of-range-compare]
                if (section_names[i].attach_type == -EINVAL)
                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~
1 error generated.

Fix the error by keeping "is_attachable" property of a program in a
separate struct field instead of trying to use attach_type itself.

Fixes: 956b620fcf ("libbpf: Introduce libbpf_attach_type_by_name")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-31 23:06:17 +01:00
Daniel Borkmann
3dca21156b bpf, libbpf: simplify and cleanup perf ring buffer walk
Simplify bpf_perf_event_read_simple() a bit and fix up some minor
things along the way: the return code in the header is not of type
int but enum bpf_perf_event_ret instead. Once callback indicated
to break the loop walking event data, it also needs to be consumed
in data_tail since it has been processed already.

Moreover, bpf_perf_event_print_t callback should avoid void * as
we actually get a pointer to struct perf_event_header and thus
applications can make use of container_of() to have type checks.
The walk also doesn't have to use modulo op since the ring size is
required to be power of two.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-20 23:13:32 -07:00
Daniel Borkmann
a64af0ef1c bpf, libbpf: use correct barriers in perf ring buffer walk
Given libbpf is a generic library and not restricted to x86-64 only,
the compiler barrier in bpf_perf_event_read_simple() after fetching
the head needs to be replaced with smp_rmb() at minimum. Also, writing
out the tail we should use WRITE_ONCE() to avoid store tearing.

Now that we have the logic in place in ring_buffer_read_head() and
ring_buffer_write_tail() helper also used by perf tool which would
select the correct and best variant for a given architecture (e.g.
x86-64 can avoid CPU barriers entirely), make use of these in order
to fix bpf_perf_event_read_simple().

Fixes: d0cabbb021 ("tools: bpf: move the event reading loop to libbpf")
Fixes: 39111695b1 ("samples: bpf: add bpf_perf_event_output example")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-19 13:43:08 -07:00