Commit Graph

19425 Commits

Author SHA1 Message Date
Andrii Nakryiko
0b163565b9 selftests/bpf: Add field size relocation tests
Add test verifying correctness and logic of field size relocation support in
libbpf.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-6-andriin@fb.com
2019-11-04 16:06:56 +01:00
Andrii Nakryiko
8b1cb1c960 selftest/bpf: Add relocatable bitfield reading tests
Add a bunch of selftests verifying correctness of relocatable bitfield reading
support in libbpf. Both bpf_probe_read()-based and direct read-based bitfield
macros are tested. core_reloc.c "test_harness" is extended to support raw
tracepoint and new typed raw tracepoints as test BPF program types.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-5-andriin@fb.com
2019-11-04 16:06:56 +01:00
Andrii Nakryiko
94f060e984 libbpf: Add support for field size relocations
Add bpf_core_field_size() macro, capturing a relocation against field size.
Adjust bits of internal libbpf relocation logic to allow capturing size
relocations of various field types: arrays, structs/unions, enums, etc.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-4-andriin@fb.com
2019-11-04 16:06:56 +01:00
Andrii Nakryiko
ee26dade0e libbpf: Add support for relocatable bitfields
Add support for the new field relocation kinds, necessary to support
relocatable bitfield reads. Provide macro for abstracting necessary code doing
full relocatable bitfield extraction into u64 value. Two separate macros are
provided:
- BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs
(e.g., typed raw tracepoints). It uses direct memory dereference to extract
bitfield backing integer value.
- BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs
to be used to extract same backing integer value.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com
2019-11-04 16:06:56 +01:00
Andrii Nakryiko
42765ede5c selftests/bpf: Remove too strict field offset relo test cases
As libbpf is going to gain support for more field relocations, including field
size, some restrictions about exact size match are going to be lifted. Remove
test cases that explicitly test such failures.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191101222810.1246166-2-andriin@fb.com
2019-11-04 16:06:56 +01:00
David S. Miller
ae8a76fb8b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-11-02

The following pull-request contains BPF updates for your *net-next* tree.

We've added 30 non-merge commits during the last 7 day(s) which contain
a total of 41 files changed, 1864 insertions(+), 474 deletions(-).

The main changes are:

1) Fix long standing user vs kernel access issue by introducing
   bpf_probe_read_user() and bpf_probe_read_kernel() helpers, from Daniel.

2) Accelerated xskmap lookup, from Björn and Maciej.

3) Support for automatic map pinning in libbpf, from Toke.

4) Cleanup of BTF-enabled raw tracepoints, from Alexei.

5) Various fixes to libbpf and selftests.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-02 15:29:58 -07:00
David S. Miller
d31e95585c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.

The rest were (relatively) trivial in nature.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-02 13:54:56 -07:00
Daniel Borkmann
fa553d9b57 bpf, testing: Add selftest to read/write sockaddr from user space
Tested on x86-64 and Ilya was also kind enough to give it a spin on
s390x, both passing with probe_user:OK there. The test is using the
newly added bpf_probe_read_user() to dump sockaddr from connect call
into .bss BPF map and overrides the user buffer via bpf_probe_write_user():

  # ./test_progs
  [...]
  #17 pkt_md_access:OK
  #18 probe_user:OK
  #19 prog_run_xattr:OK
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/90f449d8af25354e05080e82fc6e2d3179da30ea.1572649915.git.daniel@iogearbox.net
2019-11-02 12:45:08 -07:00
Daniel Borkmann
50f9aa44ca bpf, testing: Convert prog tests to probe_read_{user, kernel}{, _str} helper
Use probe read *_{kernel,user}{,_str}() helpers instead of bpf_probe_read()
or bpf_probe_read_user_str() for program tests where appropriate.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/4a61d4b71ce3765587d8ef5cb93afa18515e5b3e.1572649915.git.daniel@iogearbox.net
2019-11-02 12:39:13 -07:00
Daniel Borkmann
6ae08ae3de bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers
The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
in that they assume they can be used for probing memory access for kernel
space addresses /as well as/ user space addresses.

However, plain use of probe_kernel_read() for both cases will attempt to
always access kernel space address space given access is performed under
KERNEL_DS and some archs in-fact have overlapping address spaces where a
kernel pointer and user pointer would have the /same/ address value and
therefore accessing application memory via bpf_probe_read{,_str}() would
read garbage values.

Lets fix BPF side by making use of recently added 3d7081822f ("uaccess:
Add non-pagefault user-space read functions"). Unfortunately, the only way
to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}()
and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}()
helpers are kept as-is to retain their current behavior.

The two *_user() variants attempt the access always under USER_DS set, the
two *_kernel() variants will -EFAULT when accessing user memory if the
underlying architecture has non-overlapping address ranges, also avoiding
throwing the kernel warning via 00c42373d3 ("x86-64: add warning for
non-canonical user access address dereferences").

Fixes: a5e8c07059 ("bpf: add bpf_probe_read_str helper")
Fixes: 2541517c32 ("tracing, perf: Implement BPF programs attached to kprobes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net
2019-11-02 12:39:12 -07:00
Toke Høiland-Jørgensen
2f4a32cc83 selftests: Add tests for automatic map pinning
This adds a new BPF selftest to exercise the new automatic map pinning
code.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269298209.394725.15420085139296213182.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen
57a00f4164 libbpf: Add auto-pinning of maps when loading BPF objects
This adds support to libbpf for setting map pinning information as part of
the BTF map declaration, to get automatic map pinning (and reuse) on load.
The pinning type currently only supports a single PIN_BY_NAME mode, where
each map will be pinned by its name in a path that can be overridden, but
defaults to /sys/fs/bpf.

Since auto-pinning only does something if any maps actually have a
'pinning' BTF attribute set, we default the new option to enabled, on the
assumption that seamless pinning is what most callers want.

When a map has a pin_path set at load time, libbpf will compare the map
pinned at that location (if any), and if the attributes match, will re-use
that map instead of creating a new one. If no existing map is found, the
newly created map will instead be pinned at the location.

Programs wanting to customise the pinning can override the pinning paths
using bpf_map__set_pin_path() before calling bpf_object__load() (including
setting it to NULL to disable pinning of a particular map).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269298092.394725.3966306029218559681.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen
196f8487f5 libbpf: Move directory creation into _pin() functions
The existing pin_*() functions all try to create the parent directory
before pinning. Move this check into the per-object _pin() functions
instead. This ensures consistent behaviour when auto-pinning is
added (which doesn't go through the top-level pin_maps() function), at the
cost of a few more calls to mkdir().

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297985.394725.5882630952992598610.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen
4580b25fce libbpf: Store map pin path and status in struct bpf_map
Support storing and setting a pin path in struct bpf_map, which can be used
for automatic pinning. Also store the pin status so we can avoid attempts
to re-pin a map that has already been pinned (or reused from a previous
pinning).

The behaviour of bpf_object__{un,}pin_maps() is changed so that if it is
called with a NULL path argument (which was previously illegal), it will
(un)pin only those maps that have a pin_path set.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297876.394725.14782206533681896279.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen
d1b4574a4b libbpf: Fix error handling in bpf_map__reuse_fd()
bpf_map__reuse_fd() was calling close() in the error path before returning
an error value based on errno. However, close can change errno, so that can
lead to potentially misleading error messages. Instead, explicitly store
errno in the err variable before each goto.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297769.394725.12634985106772698611.stgit@toke.dk
2019-11-02 12:35:06 -07:00
Linus Torvalds
1204c70d9d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix free/alloc races in batmanadv, from Sven Eckelmann.

 2) Several leaks and other fixes in kTLS support of mlx5 driver, from
    Tariq Toukan.

 3) BPF devmap_hash cost calculation can overflow on 32-bit, from Toke
    Høiland-Jørgensen.

 4) Add an r8152 device ID, from Kazutoshi Noguchi.

 5) Missing include in ipv6's addrconf.c, from Ben Dooks.

 6) Use siphash in flow dissector, from Eric Dumazet. Attackers can
    easily infer the 32-bit secret otherwise etc.

 7) Several netdevice nesting depth fixes from Taehee Yoo.

 8) Fix several KCSAN reported errors, from Eric Dumazet. For example,
    when doing lockless skb_queue_empty() checks, and accessing
    sk_napi_id/sk_incoming_cpu lockless as well.

 9) Fix jumbo packet handling in RXRPC, from David Howells.

10) Bump SOMAXCONN and tcp_max_syn_backlog values, from Eric Dumazet.

11) Fix DMA synchronization in gve driver, from Yangchun Fu.

12) Several bpf offload fixes, from Jakub Kicinski.

13) Fix sk_page_frag() recursion during memory reclaim, from Tejun Heo.

14) Fix ping latency during high traffic rates in hisilicon driver, from
    Jiangfent Xiao.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits)
  net: fix installing orphaned programs
  net: cls_bpf: fix NULL deref on offload filter removal
  selftests: bpf: Skip write only files in debugfs
  selftests: net: reuseport_dualstack: fix uninitalized parameter
  r8169: fix wrong PHY ID issue with RTL8168dp
  net: dsa: bcm_sf2: Fix IMP setup for port different than 8
  net: phylink: Fix phylink_dbg() macro
  gve: Fixes DMA synchronization.
  inet: stop leaking jiffies on the wire
  ixgbe: Remove duplicate clear_bit() call
  Documentation: networking: device drivers: Remove stray asterisks
  e1000: fix memory leaks
  i40e: Fix receive buffer starvation for AF_XDP
  igb: Fix constant media auto sense switching when no cable is connected
  net: ethernet: arc: add the missed clk_disable_unprepare
  igb: Enable media autosense for the i350.
  igb/igc: Don't warn on fatal read failures when the device is removed
  tcp: increase tcp_max_syn_backlog max value
  net: increase SOMAXCONN to 4096
  netdevsim: Fix use-after-free during device dismantle
  ...
2019-11-01 17:48:11 -07:00
Jakub Kicinski
8101e06941 selftests: bpf: Skip write only files in debugfs
DebugFS for netdevsim now contains some "action trigger" files
which are write only. Don't try to capture the contents of those.

Note that we can't use os.access() because the script requires
root.

Fixes: 4418f862d6 ("netdevsim: implement support for devlink region and snapshots")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 15:16:01 -07:00
Wei Wang
d64479a3e3 selftests: net: reuseport_dualstack: fix uninitalized parameter
This test reports EINVAL for getsockopt(SOL_SOCKET, SO_DOMAIN)
occasionally due to the uninitialized length parameter.
Initialize it to fix this, and also use int for "test_family" to comply
with the API standard.

Fixes: d6a61f80b8 ("soreuseport: test mixed v4/v6 sockets")
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Cc: Craig Gallek <cgallek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 15:11:02 -07:00
Roman Mashak
c23fcbbc6a tc-testing: added tests with cookie for conntrack TC action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 14:53:41 -07:00
Jakub Kicinski
75b0bfd2e1 Revert "selftests: bpf: Don't try to read files without read permission"
This reverts commit 5bc60de50d ("selftests: bpf: Don't try to read
files without read permission").

Quoted commit does not work at all, and was never tested.
Script requires root permissions (and tests for them)
and os.access() will always return true for root.

The correct fix is needed in the bpf tree, so let's just
revert and save ourselves the merge conflict.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Pirko <jiri@resnulli.us>
Link: https://lore.kernel.org/bpf/20191101005127.1355-1-jakub.kicinski@netronome.com
2019-11-01 13:13:21 +01:00
Alexei Starovoitov
12a8654b2e libbpf: Add support for prog_tracing
Cleanup libbpf from expected_attach_type == attach_btf_id hack
and introduce BPF_PROG_TYPE_TRACING.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org
2019-10-31 15:16:59 +01:00
Vlad Buslov
9ae6b78708 tc-testing: implement tests for new fast_init action flag
Add basic tests to verify action creation with new fast_init flag for all
actions that support the flag.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-30 18:07:51 -07:00
Roman Mashak
c4917bfc3a tc-testing: fixed two failing pedit tests
Two pedit tests were failing due to incorrect operation
value in matchPattern, should be 'add' not 'val', so fix it.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-30 12:38:08 -07:00
Ilya Leoshkevich
9ffccb7606 selftests/bpf: Test narrow load from bpf_sysctl.write
There are tests for full and narrows loads from bpf_sysctl.file_pos, but
for bpf_sysctl.write only full load is tested. Add the missing test.

Suggested-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20191029143027.28681-1-iii@linux.ibm.com
2019-10-30 16:24:06 +01:00
Andrii Nakryiko
a566e35f1e libbpf: Don't use kernel-side u32 type in xsk.c
u32 is a kernel-side typedef. User-space library is supposed to use __u32.
This breaks Github's projection of libbpf. Do u32 -> __u32 fix.

Fixes: 94ff9ebb49 ("libbpf: Fix compatibility for kernels without need_wakeup")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20191029055953.2461336-1-andriin@fb.com
2019-10-29 06:38:12 -07:00
Andrii Nakryiko
d3a3aa0c59 libbpf: Fix off-by-one error in ELF sanity check
libbpf's bpf_object__elf_collect() does simple sanity check after iterating
over all ELF sections, if checks that .strtab index is correct. Unfortunately,
due to section indices being 1-based, the check breaks for cases when .strtab
ends up being the very last section in ELF.

Fixes: 77ba9a5b48 ("tools lib bpf: Fetch map names from correct strtab")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191028233727.1286699-1-andriin@fb.com
2019-10-28 20:27:40 -07:00
Magnus Karlsson
94ff9ebb49 libbpf: Fix compatibility for kernels without need_wakeup
When the need_wakeup flag was added to AF_XDP, the format of the
XDP_MMAP_OFFSETS getsockopt was extended. Code was added to the
kernel to take care of compatibility issues arrising from running
applications using any of the two formats. However, libbpf was
not extended to take care of the case when the application/libbpf
uses the new format but the kernel only supports the old
format. This patch adds support in libbpf for parsing the old
format, before the need_wakeup flag was added, and emulating a
set of static need_wakeup flags that will always work for the
application.

v2 -> v3:
* Incorporated code improvements suggested by Jonathan Lemon

v1 -> v2:
* Rebased to bpf-next
* Rewrote the code as the previous version made you blind

Fixes: a4500432c2 ("libbpf: add support for need_wakeup flag in AF_XDP part")
Reported-by: Eloy Degen <degeneloy@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1571995035-21889-1-git-send-email-magnus.karlsson@intel.com
2019-10-28 20:25:32 -07:00
Ilya Leoshkevich
e93d99180a selftests/bpf: Restore $(OUTPUT)/test_stub.o rule
`make O=/linux-build kselftest TARGETS=bpf` fails with

	make[3]: *** No rule to make target '/linux-build/bpf/test_stub.o', needed by '/linux-build/bpf/test_verifier'

The same command without the O= part works, presumably thanks to the
implicit rule.

Fix by restoring the explicit $(OUTPUT)/test_stub.o rule.

Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191028102110.7545-1-iii@linux.ibm.com
2019-10-28 16:16:10 +01:00
Ilya Leoshkevich
313e7f6fb1 selftest/bpf: Use -m{little, big}-endian for clang
When cross-compiling tests from x86 to s390, the resulting BPF objects
fail to load due to endianness mismatch.

Fix by using BPF-GCC endianness check for clang as well.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191028102049.7489-1-iii@linux.ibm.com
2019-10-28 16:15:51 +01:00
Linus Torvalds
a8a31fdcca Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A set of perf fixes:

  kernel:

   - Unbreak the tracking of auxiliary buffer allocations which got
     imbalanced causing recource limit failures.

   - Fix the fallout of splitting of ToPA entries which missed to shift
     the base entry PA correctly.

   - Use the correct context to lookup the AUX event when unmapping the
     associated AUX buffer so the event can be stopped and the buffer
     reference dropped.

  tools:

   - Fix buildiid-cache mode setting in copyfile_mode_ns() when copying
     /proc/kcore

   - Fix freeing id arrays in the event list so the correct event is
     closed.

   - Sync sched.h anc kvm.h headers with the kernel sources.

   - Link jvmti against tools/lib/ctype.o to have weak strlcpy().

   - Fix multiple memory and file descriptor leaks, found by coverity in
     perf annotate.

   - Fix leaks in error handling paths in 'perf c2c', 'perf kmem', found
     by a static analysis tool"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/aux: Fix AUX output stopping
  perf/aux: Fix tracking of auxiliary trace buffer allocation
  perf/x86/intel/pt: Fix base for single entry topa
  perf kmem: Fix memory leak in compact_gfp_flags()
  tools headers UAPI: Sync sched.h with the kernel
  tools headers kvm: Sync kvm.h headers with the kernel sources
  tools headers kvm: Sync kvm headers with the kernel sources
  tools headers kvm: Sync kvm headers with the kernel sources
  perf c2c: Fix memory leak in build_cl_output()
  perf tools: Fix mode setting in copyfile_mode_ns()
  perf annotate: Fix multiple memory and file descriptor leaks
  perf tools: Fix resource leak of closedir() on the error paths
  perf evlist: Fix fix for freed id arrays
  perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()
2019-10-27 06:59:34 -04:00
David S. Miller
5b7fe93db0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-10-27

The following pull-request contains BPF updates for your *net-next* tree.

We've added 52 non-merge commits during the last 11 day(s) which contain
a total of 65 files changed, 2604 insertions(+), 1100 deletions(-).

The main changes are:

 1) Revolutionize BPF tracing by using in-kernel BTF to type check BPF
    assembly code. The work here teaches BPF verifier to recognize
    kfree_skb()'s first argument as 'struct sk_buff *' in tracepoints
    such that verifier allows direct use of bpf_skb_event_output() helper
    used in tc BPF et al (w/o probing memory access) that dumps skb data
    into perf ring buffer. Also add direct loads to probe memory in order
    to speed up/replace bpf_probe_read() calls, from Alexei Starovoitov.

 2) Big batch of changes to improve libbpf and BPF kselftests. Besides
    others: generalization of libbpf's CO-RE relocation support to now
    also include field existence relocations, revamp the BPF kselftest
    Makefile to add test runner concept allowing to exercise various
    ways to build BPF programs, and teach bpf_object__open() and friends
    to automatically derive BPF program type/expected attach type from
    section names to ease their use, from Andrii Nakryiko.

 3) Fix deadlock in stackmap's build-id lookup on rq_lock(), from Song Liu.

 4) Allow to read BTF as raw data from bpftool. Most notable use case
    is to dump /sys/kernel/btf/vmlinux through this, from Jiri Olsa.

 5) Use bpf_redirect_map() helper in libbpf's AF_XDP helper prog which
    manages to improve "rx_drop" performance by ~4%., from Björn Töpel.

 6) Fix to restore the flow dissector after reattach BPF test and also
    fix error handling in bpf_helper_defs.h generation, from Jakub Sitnicki.

 7) Improve verifier's BTF ctx access for use outside of raw_tp, from
    Martin KaFai Lau.

 8) Improve documentation for AF_XDP with new sections and to reflect
    latest features, from Magnus Karlsson.

 9) Add back 'version' section parsing to libbpf for old kernels, from
    John Fastabend.

10) Fix strncat bounds error in libbpf's libbpf_prog_type_by_name(),
    from KP Singh.

11) Turn on -mattr=+alu32 in LLVM by default for BPF kselftests in order
    to improve insn coverage for built BPF progs, from Yonghong Song.

12) Misc minor cleanups and fixes, from various others.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26 22:57:27 -07:00
David S. Miller
1a51a47491 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-10-27

The following pull-request contains BPF updates for your *net* tree.

We've added 7 non-merge commits during the last 11 day(s) which contain
a total of 7 files changed, 66 insertions(+), 16 deletions(-).

The main changes are:

1) Fix two use-after-free bugs in relation to RCU in jited symbol exposure to
   kallsyms, from Daniel Borkmann.

2) Fix NULL pointer dereference in AF_XDP rx-only sockets, from Magnus Karlsson.

3) Fix hang in netdev unregister for hash based devmap as well as another overflow
   bug on 32 bit archs in memlock cost calculation, from Toke Høiland-Jørgensen.

4) Fix wrong memory access in LWT BPF programs on reroute due to invalid dst.
   Also fix BPF selftests to use more compatible nc options, from Jiri Benc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26 18:30:55 -07:00
Roman Mashak
b951248518 tc-testing: list required kernel options for act_ct action
Updated config with required kernel options for conntrac TC action,
so that tdc can run the tests.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26 11:40:22 -07:00
David S. Miller
4b1f5ddaff Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
more specifically:

* Updates for ipset:

1) Coding style fix for ipset comment extension, from Jeremy Sowden.

2) De-inline many functions in ipset, from Jeremy Sowden.

3) Move ipset function definition from header to source file.

4) Move ip_set_put_flags() to source, export it as a symbol, remove
   inline.

5) Move range_to_mask() to the source file where this is used.

6) Move ip_set_get_ip_port() to the source file where this is used.

* IPVS selftests and netns improvements:

7) Two patches to speedup ipvs netns dismantle, from Haishuang Yan.

8) Three patches to add selftest script for ipvs, also from
   Haishuang Yan.

* Conntrack updates and new nf_hook_slow_list() function:

9) Document ct ecache extension, from Florian Westphal.

10) Skip ct extensions from ctnetlink dump, from Florian.

11) Free ct extension immediately, from Florian.

12) Skip access to ecache extension from nf_ct_deliver_cached_events()
    this is not correct as reported by Syzbot.

13) Add and use nf_hook_slow_list(), from Florian.

* Flowtable infrastructure updates:

14) Move priority to nf_flowtable definition.

15) Dynamic allocation of per-device hooks in flowtables.

16) Allow to include netdevice only once in flowtable definitions.

17) Rise maximum number of devices per flowtable.

* Netfilter hardware offload infrastructure updates:

18) Add nft_flow_block_chain() helper function.

19) Pass callback list to nft_setup_cb_call().

20) Add nft_flow_cls_offload_setup() helper function.

21) Remove rules for the unregistered device via netdevice event.

22) Support for multiple devices in a basechain definition at the
    ingress hook.

22) Add nft_chain_offload_cmd() helper function.

23) Add nft_flow_block_offload_init() helper function.

24) Rewind in case of failing to bind multiple devices to hook.

25) Typo in IPv6 tproxy module description, from Norman Rasmussen.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26 11:35:43 -07:00
Paolo Abeni
37de3b3541 selftests: fib_tests: add more tests for metric update
This patch adds two more tests to ipv4_addr_metric_test() to
explicitly cover the scenarios fixed by the previous patch.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26 11:25:53 -07:00
Andrii Nakryiko
027cbaaf61 selftests/bpf: Fix .gitignore to ignore no_alu32/
When switching to alu32 by default, no_alu32/ subdirectory wasn't added
to .gitignore. Fix it.

Fixes: e13a2fe642 ("tools/bpf: Turn on llvm alu32 attribute by default")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191025045503.3043427-1-andriin@fb.com
2019-10-25 23:41:22 +02:00
Jiri Olsa
a943646036 bpftool: Allow to read btf as raw data
The bpftool interface stays the same, but now it's possible
to run it over BTF raw data, like:

  $ bpftool btf dump file /sys/kernel/btf/vmlinux
  [1] INT '(anon)' size=4 bits_offset=0 nr_bits=32 encoding=(none)
  [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
  [3] CONST '(anon)' type_id=2

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Link: https://lore.kernel.org/bpf/20191024133025.10691-1-jolsa@kernel.org
2019-10-25 23:34:47 +02:00
KP Singh
58eeb2289a libbpf: Fix strncat bounds error in libbpf_prog_type_by_name
On compiling samples with this change, one gets an error:

 error: ‘strncat’ specified bound 118 equals destination size
  [-Werror=stringop-truncation]

    strncat(dst, name + section_names[i].len,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name));
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

strncat requires the destination to have enough space for the
terminating null byte.

Fixes: f75a697e09 ("libbpf: Auto-detect btf_id of BTF-based raw_tracepoint")
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191023154038.24075-1-kpsingh@chromium.org
2019-10-23 10:17:28 -07:00
Andrii Nakryiko
45e587b5e8 selftests/bpf: Fix LDLIBS order
Order of $(LDLIBS) matters to linker, so put it after all the .o and .a
files.

Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191023153128.3486140-1-andriin@fb.com
2019-10-23 10:09:48 -07:00
Andrii Nakryiko
9bc6384b36 selftests/bpf: Move test_section_names into test_progs and fix it
Make test_section_names into test_progs test. Also fix ESRCH expected
results. Add uprobe/uretprobe and tp/raw_tp test cases.

Fixes: dd4436bb83 ("libbpf: Teach bpf_object__open to guess program types")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191023060913.1713817-1-andriin@fb.com
2019-10-23 10:06:46 -07:00
Björn Töpel
d7d962a095 libbpf: Use implicit XSKMAP lookup from AF_XDP XDP program
In commit 43e74c0267 ("bpf_xdp_redirect_map: Perform map lookup in
eBPF helper") the bpf_redirect_map() helper learned to do map lookup,
which means that the explicit lookup in the XDP program for AF_XDP is
not needed for post-5.3 kernels.

This commit adds the implicit map lookup with default action, which
improves the performance for the "rx_drop" [1] scenario with ~4%.

For pre-5.3 kernels, the bpf_redirect_map() returns XDP_ABORTED, and a
fallback path for backward compatibility is entered, where explicit
lookup is still performed. This means a slight regression for older
kernels (an additional bpf_redirect_map() call), but I consider that a
fair punishment for users not upgrading their kernels. ;-)

v1->v2: Backward compatibility (Toke) [2]
v2->v3: Avoid masking/zero-extension by using JMP32 [3]

[1] # xdpsock -i eth0 -z -r
[2] https://lore.kernel.org/bpf/87pnirb3dc.fsf@toke.dk/
[3] https://lore.kernel.org/bpf/87v9sip0i8.fsf@toke.dk/

Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022072206.6318-1-bjorn.topel@gmail.com
2019-10-23 10:03:52 -07:00
David Ahern
b5b9181c24 selftests: Make l2tp.sh executable
Kernel test robot reported that the l2tp.sh test script failed:
    # selftests: net: l2tp.sh
    # Warning: file l2tp.sh is not executable, correct this.

Set executable bits.

Fixes: e858ef1cd4 ("selftests: Add l2tp tests")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22 14:01:35 -07:00
Andrii Nakryiko
e00aca65e6 libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration
LIBBPF_OPTS is implemented as a mix of field declaration and memset
+ assignment. This makes it neither variable declaration nor purely
statements, which is a problem, because you can't mix it with either
other variable declarations nor other function statements, because C90
compiler mode emits warning on mixing all that together.

This patch changes LIBBPF_OPTS into a strictly declaration of variable
and solves this problem, as can be seen in case of bpftool, which
previously would emit compiler warning, if done this way (LIBBPF_OPTS as
part of function variables declaration block).

This patch also renames LIBBPF_OPTS into DECLARE_LIBBPF_OPTS to follow
kernel convention for similar macros more closely.

v1->v2:
- rename LIBBPF_OPTS into DECLARE_LIBBPF_OPTS (Jakub Sitnicki).

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20191022172100.3281465-1-andriin@fb.com
2019-10-22 21:35:03 +02:00
Yonghong Song
e13a2fe642 tools/bpf: Turn on llvm alu32 attribute by default
LLVM alu32 was introduced in LLVM7:

  https://reviews.llvm.org/rL325987
  https://reviews.llvm.org/rL325989

Experiments showed that in general performance is better with alu32
enabled:

  https://lwn.net/Articles/775316/

This patch turns on alu32 with no-flavor test_progs which is tested
most often. The flavor test at no_alu32/test_progs can be used to
test without alu32 enabled. The Makefile check for whether LLVM
supports '-mattr=+alu32 -mcpu=v3' is removed as LLVM7 should be
available for recent distributions and also latest LLVM is preferred
to run BPF selftests.

Note that jmp32 is checked by -mcpu=probe and will be enabled if the
host kernel supports it.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191022043119.2625263-1-yhs@fb.com
2019-10-22 21:13:39 +02:00
Vitaly Kuznetsov
ef40598098 selftests: kvm: fix sync_regs_test with newer gccs
Commit 204c91eff7 ("KVM: selftests: do not blindly clobber registers in
 guest asm") was intended to make test more gcc-proof, however, the result
is exactly the opposite: on newer gccs (e.g. 8.2.1) the test breaks with

==== Test Assertion Failure ====
  x86_64/sync_regs_test.c:168: run->s.regs.regs.rbx == 0xBAD1DEA + 1
  pid=14170 tid=14170 - Invalid argument
     1	0x00000000004015b3: main at sync_regs_test.c:166 (discriminator 6)
     2	0x00007f413fb66412: ?? ??:0
     3	0x000000000040191d: _start at ??:?
  rbx sync regs value incorrect 0x1.

Apparently, compile is still free to play games with registers even
when they have variables attached.

Re-write guest code with 'asm volatile' by embedding ucall there and
making sure rbx is preserved.

Fixes: 204c91eff7 ("KVM: selftests: do not blindly clobber registers in guest asm")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 13:31:18 +02:00
Vitaly Kuznetsov
11eada4718 selftests: kvm: vmx_dirty_log_test: skip the test when VMX is not supported
vmx_dirty_log_test fails on AMD and this is no surprise as it is VMX
specific. Bail early when nested VMX is unsupported.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 13:31:17 +02:00
Vitaly Kuznetsov
9143613ef0 selftests: kvm: consolidate VMX support checks
vmx_* tests require VMX and three of them implement the same check. Move it
to vmx library.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 13:31:16 +02:00
Vitaly Kuznetsov
700c17d9ce selftests: kvm: vmx_set_nested_state_test: don't check for VMX support twice
vmx_set_nested_state_test() checks if VMX is supported twice: in the very
beginning (and skips the whole test if it's not) and before doing
test_vmx_nested_state(). One should be enough.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 13:31:16 +02:00
Vitaly Kuznetsov
9de25d182b selftests: kvm: synchronize .gitignore to Makefile
Because "Untracked files:" are annoying.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 13:31:13 +02:00
Roman Mashak
a8fad5459d tc-testing: updated pedit TDC tests
Added test cases for IP header operations:
- set tos/precedence
- add value to tos/precedence
- clear tos/precedence
- invert tos/precedence

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-21 10:38:51 -07:00