2016-11-12 01:55:11 +07:00
|
|
|
/*
|
|
|
|
* Copyright (c) 2016 Facebook
|
|
|
|
*
|
|
|
|
* This program is free software; you can redistribute it and/or
|
|
|
|
* modify it under the terms of version 2 of the GNU General Public
|
|
|
|
* License as published by the Free Software Foundation.
|
|
|
|
*/
|
|
|
|
#define _GNU_SOURCE
|
|
|
|
#include <stdio.h>
|
|
|
|
#include <unistd.h>
|
|
|
|
#include <errno.h>
|
|
|
|
#include <string.h>
|
|
|
|
#include <assert.h>
|
|
|
|
#include <sched.h>
|
|
|
|
#include <stdlib.h>
|
|
|
|
#include <time.h>
|
bpf: fix multiple issues in selftest suite and samples
1) The test_lru_map and test_lru_dist fails building on my machine since
the sys/resource.h header is not included.
2) test_verifier fails in one test case where we try to call an invalid
function, since the verifier log output changed wrt printing function
names.
3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for
retrieving the number of possible CPUs. This is broken at least in our
scenario and really just doesn't work.
glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF.
First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l,
if that fails, depending on the config, it either tries to count CPUs
in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead.
If /proc/cpuinfo has some issue, it returns just 1 worst case. This
oddity is nothing new [1], but semantics/behaviour seems to be settled.
_SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if
that fails it looks into /proc/stat for cpuX entries, and if also that
fails for some reason, /proc/cpuinfo is consulted (and returning 1 if
unlikely all breaks down).
While that might match num_possible_cpus() from the kernel in some
cases, it's really not guaranteed with CPU hotplugging, and can result
in a buffer overflow since the array in user space could have too few
number of slots, and on perpcu map lookup, the kernel will write beyond
that memory of the value buffer.
William Tu reported such mismatches:
[...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu()
happens when CPU hotadd is enabled. For example, in Fusion when
setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64
-smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf()
will be 2 [2]. [...]
Documentation/cputopology.txt says /sys/devices/system/cpu/possible
outputs cpu_possible_mask. That is the same as in num_possible_cpus(),
so first step would be to fix the _SC_NPROCESSORS_CONF calls with our
own implementation. Later, we could add support to bpf(2) for passing
a mask via CPU_SET(3), for example, to just select a subset of CPUs.
BPF samples code needs this fix as well (at least so that people stop
copying this). Thus, define bpf_num_possible_cpus() once in selftests
and import it from there for the sample code to avoid duplicating it.
The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated.
After all three issues are fixed, the test suite runs fine again:
# make run_tests | grep self
selftests: test_verifier [PASS]
selftests: test_maps [PASS]
selftests: test_lru_map [PASS]
selftests: test_kmod.sh [PASS]
[1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html
[2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html
Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps")
Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link")
Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH")
Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: William Tu <u9012063@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-26 07:28:09 +07:00
|
|
|
|
|
|
|
#include <sys/wait.h>
|
|
|
|
#include <sys/resource.h>
|
|
|
|
|
2017-02-10 06:21:39 +07:00
|
|
|
#include <bpf/bpf.h>
|
2016-11-12 01:55:11 +07:00
|
|
|
#include "bpf_sys.h"
|
bpf: fix multiple issues in selftest suite and samples
1) The test_lru_map and test_lru_dist fails building on my machine since
the sys/resource.h header is not included.
2) test_verifier fails in one test case where we try to call an invalid
function, since the verifier log output changed wrt printing function
names.
3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for
retrieving the number of possible CPUs. This is broken at least in our
scenario and really just doesn't work.
glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF.
First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l,
if that fails, depending on the config, it either tries to count CPUs
in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead.
If /proc/cpuinfo has some issue, it returns just 1 worst case. This
oddity is nothing new [1], but semantics/behaviour seems to be settled.
_SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if
that fails it looks into /proc/stat for cpuX entries, and if also that
fails for some reason, /proc/cpuinfo is consulted (and returning 1 if
unlikely all breaks down).
While that might match num_possible_cpus() from the kernel in some
cases, it's really not guaranteed with CPU hotplugging, and can result
in a buffer overflow since the array in user space could have too few
number of slots, and on perpcu map lookup, the kernel will write beyond
that memory of the value buffer.
William Tu reported such mismatches:
[...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu()
happens when CPU hotadd is enabled. For example, in Fusion when
setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64
-smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf()
will be 2 [2]. [...]
Documentation/cputopology.txt says /sys/devices/system/cpu/possible
outputs cpu_possible_mask. That is the same as in num_possible_cpus(),
so first step would be to fix the _SC_NPROCESSORS_CONF calls with our
own implementation. Later, we could add support to bpf(2) for passing
a mask via CPU_SET(3), for example, to just select a subset of CPUs.
BPF samples code needs this fix as well (at least so that people stop
copying this). Thus, define bpf_num_possible_cpus() once in selftests
and import it from there for the sample code to avoid duplicating it.
The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated.
After all three issues are fixed, the test suite runs fine again:
# make run_tests | grep self
selftests: test_verifier [PASS]
selftests: test_maps [PASS]
selftests: test_lru_map [PASS]
selftests: test_kmod.sh [PASS]
[1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html
[2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html
Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps")
Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link")
Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH")
Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: William Tu <u9012063@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-26 07:28:09 +07:00
|
|
|
#include "bpf_util.h"
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
#define LOCAL_FREE_TARGET (128)
|
|
|
|
#define PERCPU_FREE_TARGET (16)
|
|
|
|
|
|
|
|
static int nr_cpus;
|
|
|
|
|
|
|
|
static int create_map(int map_type, int map_flags, unsigned int size)
|
|
|
|
{
|
|
|
|
int map_fd;
|
|
|
|
|
2017-02-10 06:21:43 +07:00
|
|
|
map_fd = bpf_create_map(map_type, sizeof(unsigned long long),
|
2016-11-12 01:55:11 +07:00
|
|
|
sizeof(unsigned long long), size, map_flags);
|
|
|
|
|
|
|
|
if (map_fd == -1)
|
2017-02-10 06:21:43 +07:00
|
|
|
perror("bpf_create_map");
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
return map_fd;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int map_subset(int map0, int map1)
|
|
|
|
{
|
|
|
|
unsigned long long next_key = 0;
|
|
|
|
unsigned long long value0[nr_cpus], value1[nr_cpus];
|
|
|
|
int ret;
|
|
|
|
|
2017-02-10 06:21:42 +07:00
|
|
|
while (!bpf_map_get_next_key(map1, &next_key, &next_key)) {
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(map1, &next_key, value1));
|
|
|
|
ret = bpf_map_lookup_elem(map0, &next_key, value0);
|
2016-11-12 01:55:11 +07:00
|
|
|
if (ret) {
|
|
|
|
printf("key:%llu not found from map. %s(%d)\n",
|
|
|
|
next_key, strerror(errno), errno);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
if (value0[0] != value1[0]) {
|
|
|
|
printf("key:%llu value0:%llu != value1:%llu\n",
|
|
|
|
next_key, value0[0], value1[0]);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int map_equal(int lru_map, int expected)
|
|
|
|
{
|
|
|
|
return map_subset(lru_map, expected) && map_subset(expected, lru_map);
|
|
|
|
}
|
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
static int sched_next_online(int pid, int *next_to_try)
|
2016-11-12 01:55:11 +07:00
|
|
|
{
|
|
|
|
cpu_set_t cpuset;
|
2017-01-17 13:17:29 +07:00
|
|
|
int next = *next_to_try;
|
|
|
|
int ret = -1;
|
2016-11-12 01:55:11 +07:00
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
while (next < nr_cpus) {
|
2016-11-12 01:55:11 +07:00
|
|
|
CPU_ZERO(&cpuset);
|
2017-01-17 13:17:29 +07:00
|
|
|
CPU_SET(next++, &cpuset);
|
|
|
|
if (!sched_setaffinity(pid, sizeof(cpuset), &cpuset)) {
|
|
|
|
ret = 0;
|
2016-11-12 01:55:11 +07:00
|
|
|
break;
|
2017-01-17 13:17:29 +07:00
|
|
|
}
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
*next_to_try = next;
|
|
|
|
return ret;
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Size of the LRU amp is 2
|
|
|
|
* Add key=1 (+1 key)
|
|
|
|
* Add key=2 (+1 key)
|
|
|
|
* Lookup Key=1
|
|
|
|
* Add Key=3
|
|
|
|
* => Key=2 will be removed by LRU
|
|
|
|
* Iterate map. Only found key=1 and key=3
|
|
|
|
*/
|
|
|
|
static void test_lru_sanity0(int map_type, int map_flags)
|
|
|
|
{
|
|
|
|
unsigned long long key, value[nr_cpus];
|
|
|
|
int lru_map_fd, expected_map_fd;
|
2017-01-17 13:17:29 +07:00
|
|
|
int next_cpu = 0;
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
printf("%s (map_type:%d map_flags:0x%X): ", __func__, map_type,
|
|
|
|
map_flags);
|
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
assert(sched_next_online(0, &next_cpu) != -1);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
if (map_flags & BPF_F_NO_COMMON_LRU)
|
|
|
|
lru_map_fd = create_map(map_type, map_flags, 2 * nr_cpus);
|
|
|
|
else
|
|
|
|
lru_map_fd = create_map(map_type, map_flags, 2);
|
|
|
|
assert(lru_map_fd != -1);
|
|
|
|
|
|
|
|
expected_map_fd = create_map(BPF_MAP_TYPE_HASH, 0, 2);
|
|
|
|
assert(expected_map_fd != -1);
|
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
|
|
|
|
/* insert key=1 element */
|
|
|
|
|
|
|
|
key = 1;
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value, BPF_NOEXIST));
|
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
/* BPF_NOEXIST means: add new element if it doesn't exist */
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(bpf_map_update_elem(lru_map_fd, &key, value, BPF_NOEXIST) == -1
|
2016-11-12 01:55:11 +07:00
|
|
|
/* key=1 already exists */
|
2017-02-10 06:21:39 +07:00
|
|
|
&& errno == EEXIST);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(bpf_map_update_elem(lru_map_fd, &key, value, -1) == -1 &&
|
2016-11-12 01:55:11 +07:00
|
|
|
errno == EINVAL);
|
|
|
|
|
|
|
|
/* insert key=2 element */
|
|
|
|
|
|
|
|
/* check that key=2 is not found */
|
|
|
|
key = 2;
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(bpf_map_lookup_elem(lru_map_fd, &key, value) == -1 &&
|
2016-11-12 01:55:11 +07:00
|
|
|
errno == ENOENT);
|
|
|
|
|
|
|
|
/* BPF_EXIST means: update existing element */
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(bpf_map_update_elem(lru_map_fd, &key, value, BPF_EXIST) == -1 &&
|
2016-11-12 01:55:11 +07:00
|
|
|
/* key=2 is not there */
|
|
|
|
errno == ENOENT);
|
|
|
|
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value, BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
/* insert key=3 element */
|
|
|
|
|
|
|
|
/* check that key=3 is not found */
|
|
|
|
key = 3;
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(bpf_map_lookup_elem(lru_map_fd, &key, value) == -1 &&
|
2016-11-12 01:55:11 +07:00
|
|
|
errno == ENOENT);
|
|
|
|
|
|
|
|
/* check that key=1 can be found and mark the ref bit to
|
|
|
|
* stop LRU from removing key=1
|
|
|
|
*/
|
|
|
|
key = 1;
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(lru_map_fd, &key, value));
|
2016-11-12 01:55:11 +07:00
|
|
|
assert(value[0] == 1234);
|
|
|
|
|
|
|
|
key = 3;
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value, BPF_NOEXIST));
|
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
/* key=2 has been removed from the LRU */
|
|
|
|
key = 2;
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(bpf_map_lookup_elem(lru_map_fd, &key, value) == -1);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
assert(map_equal(lru_map_fd, expected_map_fd));
|
|
|
|
|
|
|
|
close(expected_map_fd);
|
|
|
|
close(lru_map_fd);
|
|
|
|
|
|
|
|
printf("Pass\n");
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Size of the LRU map is 1.5*tgt_free
|
|
|
|
* Insert 1 to tgt_free (+tgt_free keys)
|
|
|
|
* Lookup 1 to tgt_free/2
|
|
|
|
* Insert 1+tgt_free to 2*tgt_free (+tgt_free keys)
|
|
|
|
* => 1+tgt_free/2 to LOCALFREE_TARGET will be removed by LRU
|
|
|
|
*/
|
|
|
|
static void test_lru_sanity1(int map_type, int map_flags, unsigned int tgt_free)
|
|
|
|
{
|
|
|
|
unsigned long long key, end_key, value[nr_cpus];
|
|
|
|
int lru_map_fd, expected_map_fd;
|
|
|
|
unsigned int batch_size;
|
|
|
|
unsigned int map_size;
|
2017-01-17 13:17:29 +07:00
|
|
|
int next_cpu = 0;
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
if (map_flags & BPF_F_NO_COMMON_LRU)
|
|
|
|
/* Ther percpu lru list (i.e each cpu has its own LRU
|
|
|
|
* list) does not have a local free list. Hence,
|
|
|
|
* it will only free old nodes till there is no free
|
|
|
|
* from the LRU list. Hence, this test does not apply
|
|
|
|
* to BPF_F_NO_COMMON_LRU
|
|
|
|
*/
|
|
|
|
return;
|
|
|
|
|
|
|
|
printf("%s (map_type:%d map_flags:0x%X): ", __func__, map_type,
|
|
|
|
map_flags);
|
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
assert(sched_next_online(0, &next_cpu) != -1);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
batch_size = tgt_free / 2;
|
|
|
|
assert(batch_size * 2 == tgt_free);
|
|
|
|
|
|
|
|
map_size = tgt_free + batch_size;
|
|
|
|
lru_map_fd = create_map(map_type, map_flags, map_size);
|
|
|
|
assert(lru_map_fd != -1);
|
|
|
|
|
|
|
|
expected_map_fd = create_map(BPF_MAP_TYPE_HASH, 0, map_size);
|
|
|
|
assert(expected_map_fd != -1);
|
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
|
|
|
|
/* Insert 1 to tgt_free (+tgt_free keys) */
|
|
|
|
end_key = 1 + tgt_free;
|
|
|
|
for (key = 1; key < end_key; key++)
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
/* Lookup 1 to tgt_free/2 */
|
|
|
|
end_key = 1 + batch_size;
|
|
|
|
for (key = 1; key < end_key; key++) {
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(lru_map_fd, &key, value));
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Insert 1+tgt_free to 2*tgt_free
|
|
|
|
* => 1+tgt_free/2 to LOCALFREE_TARGET will be
|
|
|
|
* removed by LRU
|
|
|
|
*/
|
|
|
|
key = 1 + tgt_free;
|
|
|
|
end_key = key + tgt_free;
|
|
|
|
for (; key < end_key; key++) {
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
assert(map_equal(lru_map_fd, expected_map_fd));
|
|
|
|
|
|
|
|
close(expected_map_fd);
|
|
|
|
close(lru_map_fd);
|
|
|
|
|
|
|
|
printf("Pass\n");
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Size of the LRU map 1.5 * tgt_free
|
|
|
|
* Insert 1 to tgt_free (+tgt_free keys)
|
|
|
|
* Update 1 to tgt_free/2
|
|
|
|
* => The original 1 to tgt_free/2 will be removed due to
|
|
|
|
* the LRU shrink process
|
|
|
|
* Re-insert 1 to tgt_free/2 again and do a lookup immeidately
|
|
|
|
* Insert 1+tgt_free to tgt_free*3/2
|
|
|
|
* Insert 1+tgt_free*3/2 to tgt_free*5/2
|
|
|
|
* => Key 1+tgt_free to tgt_free*3/2
|
|
|
|
* will be removed from LRU because it has never
|
|
|
|
* been lookup and ref bit is not set
|
|
|
|
*/
|
|
|
|
static void test_lru_sanity2(int map_type, int map_flags, unsigned int tgt_free)
|
|
|
|
{
|
|
|
|
unsigned long long key, value[nr_cpus];
|
|
|
|
unsigned long long end_key;
|
|
|
|
int lru_map_fd, expected_map_fd;
|
|
|
|
unsigned int batch_size;
|
|
|
|
unsigned int map_size;
|
2017-01-17 13:17:29 +07:00
|
|
|
int next_cpu = 0;
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
if (map_flags & BPF_F_NO_COMMON_LRU)
|
|
|
|
/* Ther percpu lru list (i.e each cpu has its own LRU
|
|
|
|
* list) does not have a local free list. Hence,
|
|
|
|
* it will only free old nodes till there is no free
|
|
|
|
* from the LRU list. Hence, this test does not apply
|
|
|
|
* to BPF_F_NO_COMMON_LRU
|
|
|
|
*/
|
|
|
|
return;
|
|
|
|
|
|
|
|
printf("%s (map_type:%d map_flags:0x%X): ", __func__, map_type,
|
|
|
|
map_flags);
|
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
assert(sched_next_online(0, &next_cpu) != -1);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
batch_size = tgt_free / 2;
|
|
|
|
assert(batch_size * 2 == tgt_free);
|
|
|
|
|
|
|
|
map_size = tgt_free + batch_size;
|
|
|
|
if (map_flags & BPF_F_NO_COMMON_LRU)
|
|
|
|
lru_map_fd = create_map(map_type, map_flags,
|
|
|
|
map_size * nr_cpus);
|
|
|
|
else
|
|
|
|
lru_map_fd = create_map(map_type, map_flags, map_size);
|
|
|
|
assert(lru_map_fd != -1);
|
|
|
|
|
|
|
|
expected_map_fd = create_map(BPF_MAP_TYPE_HASH, 0, map_size);
|
|
|
|
assert(expected_map_fd != -1);
|
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
|
|
|
|
/* Insert 1 to tgt_free (+tgt_free keys) */
|
|
|
|
end_key = 1 + tgt_free;
|
|
|
|
for (key = 1; key < end_key; key++)
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
2017-02-10 06:21:39 +07:00
|
|
|
/* Any bpf_map_update_elem will require to acquire a new node
|
2016-11-12 01:55:11 +07:00
|
|
|
* from LRU first.
|
|
|
|
*
|
|
|
|
* The local list is running out of free nodes.
|
|
|
|
* It gets from the global LRU list which tries to
|
|
|
|
* shrink the inactive list to get tgt_free
|
|
|
|
* number of free nodes.
|
|
|
|
*
|
|
|
|
* Hence, the oldest key 1 to tgt_free/2
|
|
|
|
* are removed from the LRU list.
|
|
|
|
*/
|
|
|
|
key = 1;
|
|
|
|
if (map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2017-02-10 06:21:41 +07:00
|
|
|
assert(!bpf_map_delete_elem(lru_map_fd, &key));
|
2016-11-12 01:55:11 +07:00
|
|
|
} else {
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_EXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Re-insert 1 to tgt_free/2 again and do a lookup
|
|
|
|
* immeidately.
|
|
|
|
*/
|
|
|
|
end_key = 1 + batch_size;
|
|
|
|
value[0] = 4321;
|
|
|
|
for (key = 1; key < end_key; key++) {
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(bpf_map_lookup_elem(lru_map_fd, &key, value));
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(lru_map_fd, &key, value));
|
2016-11-12 01:55:11 +07:00
|
|
|
assert(value[0] == 4321);
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
|
|
|
|
/* Insert 1+tgt_free to tgt_free*3/2 */
|
|
|
|
end_key = 1 + tgt_free + batch_size;
|
|
|
|
for (key = 1 + tgt_free; key < end_key; key++)
|
|
|
|
/* These newly added but not referenced keys will be
|
|
|
|
* gone during the next LRU shrink.
|
|
|
|
*/
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
/* Insert 1+tgt_free*3/2 to tgt_free*5/2 */
|
|
|
|
end_key = key + tgt_free;
|
|
|
|
for (; key < end_key; key++) {
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
assert(map_equal(lru_map_fd, expected_map_fd));
|
|
|
|
|
|
|
|
close(expected_map_fd);
|
|
|
|
close(lru_map_fd);
|
|
|
|
|
|
|
|
printf("Pass\n");
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Size of the LRU map is 2*tgt_free
|
|
|
|
* It is to test the active/inactive list rotation
|
|
|
|
* Insert 1 to 2*tgt_free (+2*tgt_free keys)
|
|
|
|
* Lookup key 1 to tgt_free*3/2
|
|
|
|
* Add 1+2*tgt_free to tgt_free*5/2 (+tgt_free/2 keys)
|
|
|
|
* => key 1+tgt_free*3/2 to 2*tgt_free are removed from LRU
|
|
|
|
*/
|
|
|
|
static void test_lru_sanity3(int map_type, int map_flags, unsigned int tgt_free)
|
|
|
|
{
|
|
|
|
unsigned long long key, end_key, value[nr_cpus];
|
|
|
|
int lru_map_fd, expected_map_fd;
|
|
|
|
unsigned int batch_size;
|
|
|
|
unsigned int map_size;
|
2017-01-17 13:17:29 +07:00
|
|
|
int next_cpu = 0;
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
printf("%s (map_type:%d map_flags:0x%X): ", __func__, map_type,
|
|
|
|
map_flags);
|
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
assert(sched_next_online(0, &next_cpu) != -1);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
batch_size = tgt_free / 2;
|
|
|
|
assert(batch_size * 2 == tgt_free);
|
|
|
|
|
|
|
|
map_size = tgt_free * 2;
|
|
|
|
if (map_flags & BPF_F_NO_COMMON_LRU)
|
|
|
|
lru_map_fd = create_map(map_type, map_flags,
|
|
|
|
map_size * nr_cpus);
|
|
|
|
else
|
|
|
|
lru_map_fd = create_map(map_type, map_flags, map_size);
|
|
|
|
assert(lru_map_fd != -1);
|
|
|
|
|
|
|
|
expected_map_fd = create_map(BPF_MAP_TYPE_HASH, 0, map_size);
|
|
|
|
assert(expected_map_fd != -1);
|
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
|
|
|
|
/* Insert 1 to 2*tgt_free (+2*tgt_free keys) */
|
|
|
|
end_key = 1 + (2 * tgt_free);
|
|
|
|
for (key = 1; key < end_key; key++)
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
/* Lookup key 1 to tgt_free*3/2 */
|
|
|
|
end_key = tgt_free + batch_size;
|
|
|
|
for (key = 1; key < end_key; key++) {
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(lru_map_fd, &key, value));
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Add 1+2*tgt_free to tgt_free*5/2
|
|
|
|
* (+tgt_free/2 keys)
|
|
|
|
*/
|
|
|
|
key = 2 * tgt_free + 1;
|
|
|
|
end_key = key + batch_size;
|
|
|
|
for (; key < end_key; key++) {
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
assert(map_equal(lru_map_fd, expected_map_fd));
|
|
|
|
|
|
|
|
close(expected_map_fd);
|
|
|
|
close(lru_map_fd);
|
|
|
|
|
|
|
|
printf("Pass\n");
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Test deletion */
|
|
|
|
static void test_lru_sanity4(int map_type, int map_flags, unsigned int tgt_free)
|
|
|
|
{
|
|
|
|
int lru_map_fd, expected_map_fd;
|
|
|
|
unsigned long long key, value[nr_cpus];
|
|
|
|
unsigned long long end_key;
|
2017-01-17 13:17:29 +07:00
|
|
|
int next_cpu = 0;
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
printf("%s (map_type:%d map_flags:0x%X): ", __func__, map_type,
|
|
|
|
map_flags);
|
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
assert(sched_next_online(0, &next_cpu) != -1);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
if (map_flags & BPF_F_NO_COMMON_LRU)
|
|
|
|
lru_map_fd = create_map(map_type, map_flags,
|
|
|
|
3 * tgt_free * nr_cpus);
|
|
|
|
else
|
|
|
|
lru_map_fd = create_map(map_type, map_flags, 3 * tgt_free);
|
|
|
|
assert(lru_map_fd != -1);
|
|
|
|
|
|
|
|
expected_map_fd = create_map(BPF_MAP_TYPE_HASH, 0,
|
|
|
|
3 * tgt_free);
|
|
|
|
assert(expected_map_fd != -1);
|
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
|
|
|
|
for (key = 1; key <= 2 * tgt_free; key++)
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
key = 1;
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(bpf_map_update_elem(lru_map_fd, &key, value, BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
for (key = 1; key <= tgt_free; key++) {
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(lru_map_fd, &key, value));
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
for (; key <= 2 * tgt_free; key++) {
|
2017-02-10 06:21:41 +07:00
|
|
|
assert(!bpf_map_delete_elem(lru_map_fd, &key));
|
|
|
|
assert(bpf_map_delete_elem(lru_map_fd, &key));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
end_key = key + 2 * tgt_free;
|
|
|
|
for (; key < end_key; key++) {
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(lru_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
|
|
|
assert(!bpf_map_update_elem(expected_map_fd, &key, value,
|
|
|
|
BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
assert(map_equal(lru_map_fd, expected_map_fd));
|
|
|
|
|
|
|
|
close(expected_map_fd);
|
|
|
|
close(lru_map_fd);
|
|
|
|
|
|
|
|
printf("Pass\n");
|
|
|
|
}
|
|
|
|
|
|
|
|
static void do_test_lru_sanity5(unsigned long long last_key, int map_fd)
|
|
|
|
{
|
|
|
|
unsigned long long key, value[nr_cpus];
|
|
|
|
|
|
|
|
/* Ensure the last key inserted by previous CPU can be found */
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(map_fd, &last_key, value));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
|
|
|
|
key = last_key + 1;
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(map_fd, &key, value, BPF_NOEXIST));
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(!bpf_map_lookup_elem(map_fd, &key, value));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
/* Cannot find the last key because it was removed by LRU */
|
2017-02-10 06:21:40 +07:00
|
|
|
assert(bpf_map_lookup_elem(map_fd, &last_key, value));
|
2016-11-12 01:55:11 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Test map with only one element */
|
|
|
|
static void test_lru_sanity5(int map_type, int map_flags)
|
|
|
|
{
|
|
|
|
unsigned long long key, value[nr_cpus];
|
2017-01-17 13:17:29 +07:00
|
|
|
int next_cpu = 0;
|
2016-11-12 01:55:11 +07:00
|
|
|
int map_fd;
|
|
|
|
|
|
|
|
if (map_flags & BPF_F_NO_COMMON_LRU)
|
|
|
|
return;
|
|
|
|
|
|
|
|
printf("%s (map_type:%d map_flags:0x%X): ", __func__, map_type,
|
|
|
|
map_flags);
|
|
|
|
|
|
|
|
map_fd = create_map(map_type, map_flags, 1);
|
|
|
|
assert(map_fd != -1);
|
|
|
|
|
|
|
|
value[0] = 1234;
|
|
|
|
key = 0;
|
2017-02-10 06:21:39 +07:00
|
|
|
assert(!bpf_map_update_elem(map_fd, &key, value, BPF_NOEXIST));
|
2016-11-12 01:55:11 +07:00
|
|
|
|
2017-01-17 13:17:29 +07:00
|
|
|
while (sched_next_online(0, &next_cpu) != -1) {
|
2016-11-12 01:55:11 +07:00
|
|
|
pid_t pid;
|
|
|
|
|
|
|
|
pid = fork();
|
|
|
|
if (pid == 0) {
|
2017-01-17 13:17:29 +07:00
|
|
|
do_test_lru_sanity5(key, map_fd);
|
2016-11-12 01:55:11 +07:00
|
|
|
exit(0);
|
|
|
|
} else if (pid == -1) {
|
2017-01-17 13:17:29 +07:00
|
|
|
printf("couldn't spawn process to test key:%llu\n",
|
|
|
|
key);
|
2016-11-12 01:55:11 +07:00
|
|
|
exit(1);
|
|
|
|
} else {
|
|
|
|
int status;
|
|
|
|
|
|
|
|
assert(waitpid(pid, &status, 0) == pid);
|
|
|
|
assert(status == 0);
|
|
|
|
key++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
close(map_fd);
|
2017-01-17 13:17:29 +07:00
|
|
|
/* At least one key should be tested */
|
|
|
|
assert(key > 0);
|
2016-11-12 01:55:11 +07:00
|
|
|
|
|
|
|
printf("Pass\n");
|
|
|
|
}
|
|
|
|
|
|
|
|
int main(int argc, char **argv)
|
|
|
|
{
|
|
|
|
struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
|
|
|
|
int map_types[] = {BPF_MAP_TYPE_LRU_HASH,
|
|
|
|
BPF_MAP_TYPE_LRU_PERCPU_HASH};
|
|
|
|
int map_flags[] = {0, BPF_F_NO_COMMON_LRU};
|
|
|
|
int t, f;
|
|
|
|
|
|
|
|
setbuf(stdout, NULL);
|
|
|
|
|
|
|
|
assert(!setrlimit(RLIMIT_MEMLOCK, &r));
|
|
|
|
|
bpf: fix multiple issues in selftest suite and samples
1) The test_lru_map and test_lru_dist fails building on my machine since
the sys/resource.h header is not included.
2) test_verifier fails in one test case where we try to call an invalid
function, since the verifier log output changed wrt printing function
names.
3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for
retrieving the number of possible CPUs. This is broken at least in our
scenario and really just doesn't work.
glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF.
First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l,
if that fails, depending on the config, it either tries to count CPUs
in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead.
If /proc/cpuinfo has some issue, it returns just 1 worst case. This
oddity is nothing new [1], but semantics/behaviour seems to be settled.
_SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if
that fails it looks into /proc/stat for cpuX entries, and if also that
fails for some reason, /proc/cpuinfo is consulted (and returning 1 if
unlikely all breaks down).
While that might match num_possible_cpus() from the kernel in some
cases, it's really not guaranteed with CPU hotplugging, and can result
in a buffer overflow since the array in user space could have too few
number of slots, and on perpcu map lookup, the kernel will write beyond
that memory of the value buffer.
William Tu reported such mismatches:
[...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu()
happens when CPU hotadd is enabled. For example, in Fusion when
setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64
-smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf()
will be 2 [2]. [...]
Documentation/cputopology.txt says /sys/devices/system/cpu/possible
outputs cpu_possible_mask. That is the same as in num_possible_cpus(),
so first step would be to fix the _SC_NPROCESSORS_CONF calls with our
own implementation. Later, we could add support to bpf(2) for passing
a mask via CPU_SET(3), for example, to just select a subset of CPUs.
BPF samples code needs this fix as well (at least so that people stop
copying this). Thus, define bpf_num_possible_cpus() once in selftests
and import it from there for the sample code to avoid duplicating it.
The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated.
After all three issues are fixed, the test suite runs fine again:
# make run_tests | grep self
selftests: test_verifier [PASS]
selftests: test_maps [PASS]
selftests: test_lru_map [PASS]
selftests: test_kmod.sh [PASS]
[1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html
[2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html
Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps")
Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link")
Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH")
Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: William Tu <u9012063@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-26 07:28:09 +07:00
|
|
|
nr_cpus = bpf_num_possible_cpus();
|
2016-11-12 01:55:11 +07:00
|
|
|
assert(nr_cpus != -1);
|
|
|
|
printf("nr_cpus:%d\n\n", nr_cpus);
|
|
|
|
|
|
|
|
for (f = 0; f < sizeof(map_flags) / sizeof(*map_flags); f++) {
|
|
|
|
unsigned int tgt_free = (map_flags[f] & BPF_F_NO_COMMON_LRU) ?
|
|
|
|
PERCPU_FREE_TARGET : LOCAL_FREE_TARGET;
|
|
|
|
|
|
|
|
for (t = 0; t < sizeof(map_types) / sizeof(*map_types); t++) {
|
|
|
|
test_lru_sanity0(map_types[t], map_flags[f]);
|
|
|
|
test_lru_sanity1(map_types[t], map_flags[f], tgt_free);
|
|
|
|
test_lru_sanity2(map_types[t], map_flags[f], tgt_free);
|
|
|
|
test_lru_sanity3(map_types[t], map_flags[f], tgt_free);
|
|
|
|
test_lru_sanity4(map_types[t], map_flags[f], tgt_free);
|
|
|
|
test_lru_sanity5(map_types[t], map_flags[f]);
|
|
|
|
|
|
|
|
printf("\n");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|