Fix segfault from bpftool by adding emit_obj_refs_plain when skeleton
code is disabled.
Tested by deleting BUILD_BPF_SKELS in Makefile. We found this doing
backports for Cilium when a testing image pulled in latest bpf-next
bpftool, but kept using an older clang-7.
# ./bpftool prog show
Error: bpftool built without PID iterator support
3: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2020-07-01T08:01:29-0700 uid 0
Segmentation fault
Fixes: d53dee3fe0 ("tools/bpftool: Show info for processes holding BPF map/prog/link/btf FDs")
Reported-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159375071997.14984.17404504293832961401.stgit@john-XPS-13-9370
Kernel commit dc4e2801d4 (ring-buffer: Redefine the unimplemented
RINGBUF_TYPE_TIME_STAMP) changed the way the ring buffer timestamps work
- after that commit the previously unimplemented RINGBUF_TYPE_TIME_STAMP
type causes the time delta to be used as a timestamp rather than a delta
to be added to the timestamp.
The trace-cmd code didn't get updated to handle this, so misinterprets
the event data for this case, which causes a cascade of errors,
including trace-report not being able to identify synthetic (or any
other) events generated by the histogram code (which uses TIME_STAMP
mode). For example, the following triggers along with the trace-cmd
shown cause an UNKNOWN_EVENT error and trace-cmd report crash:
# echo 'wakeup_latency u64 lat pid_t pid char comm[16]' > /sys/kernel/debug/tracing/synthetic_events
# echo 'hist:keys=pid:ts0=common_timestamp.usecs if comm=="ping"' > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_wakeup).trace(wakeup_latency,$wakeup_lat,next_pid,next_comm) if next_comm=="ping"' > /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
# echo 'hist:keys=comm,pid,lat:wakeup_lat=lat:sort=lat' > /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
# trace-cmd record -e wakeup_latency -e sched_wakeup -f comm==\"ping\" ping localhost -c 5
# trace-cmd report
CPU 0 is empty
CPU 1 is empty
CPU 2 is empty
CPU 3 is empty
CPU 5 is empty
CPU 6 is empty
CPU 7 is empty
cpus=8
ug! no event found for type 0
[UNKNOWN TYPE 0]
ug! no event found for type 11520
Segmentation fault (core dumped)
After this patch we get the correct interpretation and the events are
shown properly:
# trace-cmd report
CPU 0 is empty
CPU 1 is empty
CPU 2 is empty
CPU 3 is empty
CPU 5 is empty
CPU 6 is empty
CPU 7 is empty
cpus=8
<idle>-0 [004] 23284.341392: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23284.341464: wakeup_latency: lat=58, pid=12031, comm=ping
<idle>-0 [004] 23285.365303: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23285.365382: wakeup_latency: lat=64, pid=12031, comm=ping
<idle>-0 [004] 23286.389290: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23286.389378: wakeup_latency: lat=72, pid=12031, comm=ping
<idle>-0 [004] 23287.413213: sched_wakeup: ping:12031 [120] success=1 CPU:004
<idle>-0 [004] 23287.413291: wakeup_latency: lat=64, pid=12031, comm=ping
Link: http://lkml.kernel.org/r/1567628224.13841.4.camel@kernel.org
Link: http://lore.kernel.org/linux-trace-devel/20200625100516.365338-3-tz.stoyanov@gmail.com
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
[ Ported from trace-cmd.git ]
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200702185703.785094515@goodmis.org
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the functions kbuffer_subbuf_timestamp() and kbuffer_ptr_delta() to
get the timing data stored in the ring buffer that is used to produced
the time stamps of the records.
This is useful for tools like trace-cmd to be able to display the
content of the read data to understand why the records show the time
stamps that they do.
Link: http://lore.kernel.org/linux-trace-devel/20200625100516.365338-2-tz.stoyanov@gmail.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
[ Ported from trace-cmd.git ]
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200702185703.619656282@goodmis.org
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using Python version 3.8.2 and PySide2 version 5.14.0, time chart call tree
would not expand the tree to the result. Fix by using setExpanded().
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Charts -> Time chart by CPU
Move mouse over middle of chart
Right-click and select Show Call Tree
Before: displays Call Tree but not expanded to selected time
After: displays Call Tree expanded to selected time
Fixes: e69d5df75d ("perf scripts python: exported-sql-viewer.py: Add ability for Call tree to open at a specified task and time")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using ctrl-F ('Find') would not find 'unknown' because it matches id
zero. Fix by excluding id zero from selection.
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Call Tree
Press: Ctrl-F
Enter: unknown
Press: Enter
Before: displays 'unknown' not found
After: tree is expanded to line showing 'unknown'
Fixes: ae8b887c00 ("perf scripts python: exported-sql-viewer.py: Add call tree")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using ctrl-F ('Find') would not find 'unknown' because it matches id zero.
Fix by excluding id zero from selection.
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Context-Sensitive Call Graph
Press: Ctrl-F
Enter: unknown
Press: Enter
Before: gets stuck
After: tree is expanded to line showing 'unknown'
Fixes: 254c0d820b ("perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using Python version 3.8.2 and PySide2 version 5.14.0, ctrl-F ('Find')
would not expand the tree to the result. Fix by using setExpanded().
Example:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
2020-06-26 15:32:14.928997 Creating database ...
2020-06-26 15:32:14.933971 Writing records...
2020-06-26 15:32:15.535251 Adding indexes
2020-06-26 15:32:15.542993 Dropping unused tables
2020-06-26 15:32:15.549716 Done
$ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db
Select: Reports -> Context-Sensitive Call Graph or Reports -> Call Tree
Press: Ctrl-F
Enter: main
Press: Enter
Before: line showing 'main' does not display
After: tree is expanded to line showing 'main'
Fixes: ebd70c7dc2 ("perf scripts python: exported-sql-viewer.py: Add ability to find symbols in the call-graph")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 0a892c1c94 ("perf record: Add dummy event during system wide
synthesis") reveals an issue with Intel PT system wide tracing.
Specifically that Intel PT already adds a dummy tracking event, and it
is not the first event. Adding another dummy tracking event causes
duplicated sideband events. Fix by checking for an existing dummy
tracking event first.
Example showing duplicated switch events:
Before:
# perf record -a -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.895 MB perf.data ]
# perf script --no-itrace --show-switch-events | head
swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
swapper 0 [007] 6390.516222: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [007] 6390.516223: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [007] 6390.516224: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
rcu_sched 11 [007] 6390.516227: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [007] 6390.516228: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [002] 6390.516415: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559
swapper 0 [002] 6390.516416: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 5556/5559
After:
# perf record -a -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.868 MB perf.data ]
# perf script --no-itrace --show-switch-events | head
swapper 0 [005] 6450.567013: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 7179/7181
perf 7181 [005] 6450.567014: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
perf 7181 [005] 6450.567028: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 6450.567029: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 7179/7181
swapper 0 [005] 6450.571699: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [005] 6450.571700: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_sched 11 [005] 6450.571702: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 6450.571703: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
swapper 0 [005] 6450.579703: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_sched 11 [005] 6450.579704: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200629091955.17090-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To bring in the change made in this cset:
e3a9e681ad ("x86/entry: Fixup bad_iret vs noinstr")
This doesn't cause any functional changes to tooling, just a rebuild.
Addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This kselftest fixes update for Linux 5.8-rc4 consists of tpm test
fixes from arkko Sakkinen.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl79/1wACgkQCwJExA0N
QxxDOg//Q5SUe3nLy3AVKats4Ma2cuIi1hb1vsrLpGbl3OU9Bbf9uV84DBrXzb0C
1Mshp9hxGWphbgkBL9Bc4/DJ3dX6WXbZXDfOblHn15w1SaD6HC/8WAU802KR9gZj
KqfFUMTncDeejk6KET8UVbphG1rVax0AQK0TAB4SnHPLt3noZYRvdVfEiCyKIQ6o
zBvXsweKKH7QehoACtm4myVnQnDhdhvoSvJt8nRox2EU38fCR+MSfYZmDHaakDpp
0JYEPVIvk+vn2qfcGDuFax4AzfNrRfe9NbaqEX9g7vVCrOITNGPQOsH/+3UHj/9r
scTRE8ItN4oHYEVtnWbIIBVYZt6VKVYbWVvfVoS2XQpaEFs3qAw241rCpb7Cg0E6
3f6QEVnhkFCNI9VCA1Z30puWQ6HmuroeHLBAfr8CB2Z14WC4kadMzUbTnj0BgJAh
7GACDPVy6ZOsqS5WUZ2v7kMqxJAJYwPUH+xZaW3M7yb5O9BdXsPDSDXblSOGhJX6
C+n/lK3DtsIYoMLW5eeRFfb5V8YZRfBP6nvjnuopgYm0H7BNfflIpybljR9lpAMo
0LRbkdEjTS9TpX7OravT7qZY3VWyr68wZLTGBQUp6lhFJhS4qjjL9mjefLBy9XQb
37VsnYXOUfnwcFXXXjxFlhVKqAIr14bE+lI0mfbnY4npqKvwjFY=
=RDv/
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-fixes-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"tpm test fixes from Jarkko Sakkinen"
* tag 'linux-kselftest-fixes-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: tpm: Use /bin/sh instead of /bin/bash
selftests: tpm: Use 'test -e' instead of 'test -f'
Revert "tpm: selftest: cleanup after unseal with wrong auth/policy test"
This kunit fixes update for Linux 5.8-rc4 consists of fixes to build
and run-times failures. Also includes troubleshooting tips updates
to kunit user documentation.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl7985cACgkQCwJExA0N
QxxFvhAA6Wa+1UMR4VKLZgfc2dL85LeV9tZO714ssIt1rxcgv2dHswsE0nbJmHyM
DsfEufqOsvpX7/ic/JqrwIl+iDGrKlV9wo+ZLl+tdt5jVeB6OP6Dr5C3jvD3eZhX
zUHxr04QGzQuJnS6gAOIrCa/qBz17duAEij6xj4if/6OAkL2Igb3PGFzhpjVKqJL
TLY5UJ80D+QHJ7o8FWsaB8bNDMu7gmOBgfMb1qGB60cFppE+regoQRtkZefLap26
MixOFgRD5DyNoGqTZzJqSn7IZxvERoHfxKchzpAUHsNn9tI0r15X016Wcgf2+B+T
2eyRJDkTP3dt4oFuML4CXeQvZOgrcZNIWeVFmBK9NcmRg0WDnWPzCE2Mm+lnZD8e
0fefiaLBZw5+ztaz24S/M3mTpZQru8N2FDgLJmpLcPulIuDYpm4tB2PkBc0AmF35
6gC3WDa6cw1qbbDgN83xd9VdlACBe2fYzenhZCqDzgE1zGquORkhuAQYZfdGrixi
ojpn7IKN+JeufiFZuu1xOJeAojIZ4JU42FxM0S1PSXf9deqICzfa1LSOWEaL+V4G
GaPq/nnMhtY2rMGFAQXyCP4YQe2XQU/Jt1SOdFA/UZ1W+oYXwjOlSVo9xpjJV/6y
4TAQ7Yg8S87CUbffYpBLw3Xkg8E0L9ih+E+UOineMcUiu6yxA2Q=
=XeG4
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-kunit-fixes-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fixes from Shuah Khan
"Fixes for build and run-times failures.
Also includes troubleshooting tips updates to kunit user
documentation"
* tag 'linux-kselftest-kunit-fixes-5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
Documentation: kunit: Add some troubleshooting tips to the FAQ
kunit: kunit_tool: Fix invalid result when build fails
kunit: show error if kunit results are not present
kunit: kunit_config: Fix parsing of CONFIG options with space
It is common for networking tests creating its netns and making its own
setting under this new netns (e.g. changing tcp sysctl). If the test
forgot to restore to the original netns, it would affect the
result of other tests.
This patch saves the original netns at the beginning and then restores it
after every test. Since the restore "setns()" is not expensive, it does it
on all tests without tracking if a test has created a new netns or not.
The new restore_netns() could also be done in test__end_subtest() such
that each subtest will get an automatic netns reset. However,
the individual test would lose flexibility to have total control
on netns for its own subtests. In some cases, forcing a test to do
unnecessary netns re-configure for each subtest is time consuming.
e.g. In my vm, forcing netns re-configure on each subtest in sk_assign.c
increased the runtime from 1s to 8s. On top of that, test_progs.c
is also doing per-test (instead of per-subtest) cleanup for cgroup.
Thus, this patch also does per-test restore_netns(). The only existing
per-subtest cleanup is reset_affinity() and no test is depending on this.
Thus, it is removed from test__end_subtest() to give a consistent
expectation to the individual tests. test_progs.c only ensures
any affinity/netns/cgroup change made by an earlier test does not
affect the following tests.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200702004858.2103728-1-kafai@fb.com
This patch makes a few changes to the network_helpers.c
1) Enforce SO_RCVTIMEO and SO_SNDTIMEO
This patch enforces timeout to the network fds through setsockopt
SO_RCVTIMEO and SO_SNDTIMEO.
It will remove the need for SOCK_NONBLOCK that requires a more demanding
timeout logic with epoll/select, e.g. epoll_create, epoll_ctrl, and
then epoll_wait for timeout.
That removes the need for connect_wait() from the
cgroup_skb_sk_lookup.c. The needed change is made in
cgroup_skb_sk_lookup.c.
2) start_server():
Add optional addr_str and port to start_server().
That removes the need of the start_server_with_port(). The caller
can pass addr_str==NULL and/or port==0.
I have a future tcp-hdr-opt test that will pass a non-NULL addr_str
and it is in general useful for other future tests.
"int timeout_ms" is also added to control the timeout
on the "accept(listen_fd)".
3) connect_to_fd(): Fully use the server_fd.
The server sock address has already been obtained from
getsockname(server_fd). The sockaddr includes the family,
so the "int family" arg is redundant.
Since the server address is obtained from server_fd, there
is little reason not to get the server's socket type from the
server_fd also. getsockopt(server_fd) can be used to do that,
so "int type" arg is also removed.
"int timeout_ms" is added.
4) connect_fd_to_fd():
"int timeout_ms" is added.
Some code is also refactored to connect_fd_to_addr() which is
shared with connect_to_fd().
5) Preserve errno:
Some callers need to check errno, e.g. cgroup_skb_sk_lookup.c.
Make changes to do it more consistently in save_errno_close()
and log_err().
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200702004852.2103003-1-kafai@fb.com
The script generates two random files that are then sent via tcp and
mptcp connections.
In order to compare throughput over consecutive runs add an option
to provide the file size on the command line: "-f 128000".
Also add an option, -t, to enable tcp tests. This is useful to
compare throughput of mptcp connections and tcp connections.
Example: run tests with a 4mb file size, 300ms delay 0.01% loss,
default gso/tso/gro settings and with large write/blocking io:
mptcp_connect.sh -t -f $((4 * 1024 * 1024)) -d 300 -l 0.01% -r 0 -e "" -m mmap
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The program test_progs have some very useful ability to specify a list of
test name substrings for selecting which tests to run.
This patch add the ability to list the selected test names without running
them. This is practical for seeing which tests gets selected with given
select arguments (which can also contain a exclude list via --name-blacklist).
This output can also be used by shell-scripts in a for-loop:
for N in $(./test_progs --list -t xdp); do \
./test_progs -t $N 2>&1 > result_test_${N}.log & \
done ; wait
This features can also be used for looking up a test number and returning
a testname. If the selection was empty then a shell EXIT_FAILURE is
returned. This is useful for scripting. e.g. like this:
n=1;
while [ $(./test_progs --list -n $n) ] ; do \
./test_progs -n $n ; n=$(( n+1 )); \
done
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363985751.930467.9610992940793316982.stgit@firesoul
It can be practial to get the number of tests that test_progs contain.
This could for example be used to create a shell for-loop construct that
runs the individual tests.
Like:
for N in $(seq 1 $(./test_progs -c)); do
./test_progs -n $N 2>&1 > result_test_${N}.log &
done ; wait
V2: Add the ability to return the count for the selected tests. This is
useful for getting a count e.g. after excluding some tests with option -b.
The current beakers test script like to report the max test count upfront.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363985244.930467.12617117873058936829.stgit@firesoul
When a user selects a non-existing test the summary is printed with
indication 0 for all info types, and shell "success" (EXIT_SUCCESS) is
indicated. This can be understood by a human end-user, but for shell
scripting is it useful to indicate a shell failure (EXIT_FAILURE).
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363984736.930467.17956007131403952343.stgit@firesoul
Turn off -Wnested-externs to avoid annoying warnings in BUILD_BUG_ON macro when
compiling bpftool:
In file included from /data/users/andriin/linux/tools/include/linux/build_bug.h:5,
from /data/users/andriin/linux/tools/include/linux/kernel.h:8,
from /data/users/andriin/linux/kernel/bpf/disasm.h:10,
from /data/users/andriin/linux/kernel/bpf/disasm.c:8:
/data/users/andriin/linux/kernel/bpf/disasm.c: In function ‘__func_get_name’:
/data/users/andriin/linux/tools/include/linux/compiler.h:37:38: warning: nested extern declaration of ‘__compiletime_assert_0’ [-Wnested-externs]
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^~~~~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/compiler.h:16:15: note: in definition of macro ‘__compiletime_assert’
extern void prefix ## suffix(void) __compiletime_error(msg); \
^~~~~~
/data/users/andriin/linux/tools/include/linux/compiler.h:37:2: note: in expansion of macro ‘_compiletime_assert’
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^~~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
^~~~~~~~~~~~~~~~
/data/users/andriin/linux/kernel/bpf/disasm.c:20:2: note: in expansion of macro ‘BUILD_BUG_ON’
BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
^~~~~~~~~~~~
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200701212816.2072340-1-andriin@fb.com
The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
programs. But in a kernel built by clang, which performs more aggresive
inlining, that function gets inlined into its caller SyS_nanosleep.
Therefore, even though fentry and kprobe do hook on the function,
they aren't triggered by the call to nanosleep in the test.
A possible fix is switching to use a function that is less likely to
be inlined, such as hrtimer_range_start_ns. The EXPORT_SYMBOL functions
shouldn't be inlined based on the description of [1], therefore safe
to use for this test. Also the arguments of this function include the
duration of sleep, therefore suitable for test verification.
[1] af3b56289b time: don't inline EXPORT_SYMBOL functions
Tested:
In a clang build kernel, before this change, the test fails:
test_vmlinux:PASS:skel_open 0 nsec
test_vmlinux:PASS:skel_attach 0 nsec
test_vmlinux:PASS:tp 0 nsec
test_vmlinux:PASS:raw_tp 0 nsec
test_vmlinux:PASS:tp_btf 0 nsec
test_vmlinux:FAIL:kprobe not called
test_vmlinux:FAIL:fentry not called
After switching to hrtimer_range_start_ns, the test passes:
test_vmlinux:PASS:skel_open 0 nsec
test_vmlinux:PASS:skel_attach 0 nsec
test_vmlinux:PASS:tp 0 nsec
test_vmlinux:PASS:raw_tp 0 nsec
test_vmlinux:PASS:tp_btf 0 nsec
test_vmlinux:PASS:kprobe 0 nsec
test_vmlinux:PASS:fentry 0 nsec
Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200701175315.1161242-1-haoluo@google.com
The new test is similar to other bpf_iter tests. It dumps all
/proc/<pid>/stack to a seq_file. Here is some example output:
pid: 2873 num_entries: 3
[<0>] worker_thread+0xc6/0x380
[<0>] kthread+0x135/0x150
[<0>] ret_from_fork+0x22/0x30
pid: 2874 num_entries: 9
[<0>] __bpf_get_stack+0x15e/0x250
[<0>] bpf_prog_22a400774977bb30_dump_task_stack+0x4a/0xb3c
[<0>] bpf_iter_run_prog+0x81/0x170
[<0>] __task_seq_show+0x58/0x80
[<0>] bpf_seq_read+0x1c3/0x3b0
[<0>] vfs_read+0x9e/0x170
[<0>] ksys_read+0xa7/0xe0
[<0>] do_syscall_64+0x4c/0xa0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Note: bpf_iter test as-is doesn't print the contents of the seq_file. To
see the example above, it is necessary to add printf() to do_dummy_read.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-5-songliubraving@fb.com
Introduce helper bpf_get_task_stack(), which dumps stack trace of given
task. This is different to bpf_get_stack(), which gets stack track of
current task. One potential use case of bpf_get_task_stack() is to call
it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file.
bpf_get_task_stack() uses stack_trace_save_tsk() instead of
get_perf_callchain() for kernel stack. The benefit of this choice is that
stack_trace_save_tsk() doesn't require changes in arch/. The downside of
using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the
stack trace to unsigned long array. For 32-bit systems, we need to
translate it to u64 array.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
There are several copies of get_eflags() and set_eflags() and they all are
buggy. Consolidate them and fix them. The fixes are:
Add memory clobbers. These are probably unnecessary but they make sure
that the compiler doesn't move something past one of these calls when it
shouldn't.
Respect the redzone on x86_64. There has no failure been observed related
to this, but it's definitely a bug.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/982ce58ae8dea2f1e57093ee894760e35267e751.1593191971.git.luto@kernel.org
Make bpf_endian.h compatible with vmlinux.h. It is a frequent request from
users wanting to use bpf_endian.h in their BPF applications using CO-RE and
vmlinux.h.
To achieve that, re-implement byte swap macros and drop all the header
includes. This way it can be used both with linux header includes, as well as
with a vmlinux.h.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630152125.3631920-2-andriin@fb.com
Similarly to bpftool Makefile, allow to specify custom location of vmlinux.h
to be used during the build. This allows simpler testing setups with
checked-in pre-generated vmlinux.h.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630004759.521530-2-andriin@fb.com
In some build contexts (e.g., Travis CI build for outdated kernel), vmlinux.h,
generated from available kernel, doesn't contain all the types necessary for
BPF program compilation. For such set up, the most maintainable way to deal
with this problem is to keep pre-generated (almost up-to-date) vmlinux.h
checked in and use it for compilation purposes. bpftool after that can deal
with kernel missing some of the features in runtime with no problems.
To that effect, allow to specify path to custom vmlinux.h to bpftool's
Makefile with VMLINUX_H variable.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630004759.521530-1-andriin@fb.com
Daniel Borkmann says:
====================
pull-request: bpf 2020-06-30
The following pull-request contains BPF updates for your *net* tree.
We've added 28 non-merge commits during the last 9 day(s) which contain
a total of 35 files changed, 486 insertions(+), 232 deletions(-).
The main changes are:
1) Fix an incorrect verifier branch elimination for PTR_TO_BTF_ID pointer
types, from Yonghong Song.
2) Fix UAPI for sockmap and flow_dissector progs that were ignoring various
arguments passed to BPF_PROG_{ATTACH,DETACH}, from Lorenz Bauer & Jakub Sitnicki.
3) Fix broken AF_XDP DMA hacks that are poking into dma-direct and swiotlb
internals and integrate it properly into DMA core, from Christoph Hellwig.
4) Fix RCU splat from recent changes to avoid skipping ingress policy when
kTLS is enabled, from John Fastabend.
5) Fix BPF ringbuf map to enforce size to be the power of 2 in order for its
position masking to work, from Andrii Nakryiko.
6) Fix regression from CAP_BPF work to re-allow CAP_SYS_ADMIN for loading
of network programs, from Maciej Żenczykowski.
7) Fix libbpf section name prefix for devmap progs, from Jesper Dangaard Brouer.
8) Fix formatting in UAPI documentation for BPF helpers, from Quentin Monnet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two tests for PTR_TO_BTF_ID vs. null ptr comparison,
one for PTR_TO_BTF_ID in the ctx structure and the
other for PTR_TO_BTF_ID after one level pointer chasing.
In both cases, the test ensures condition is not
removed.
For example, for this test
struct bpf_fentry_test_t {
struct bpf_fentry_test_t *a;
};
int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
{
if (arg == 0)
test7_result = 1;
return 0;
}
Before the previous verifier change, we have xlated codes:
int test7(long long unsigned int * ctx):
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
0: (79) r1 = *(u64 *)(r1 +0)
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
1: (b4) w0 = 0
2: (95) exit
After the previous verifier change, we have:
int test7(long long unsigned int * ctx):
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
0: (79) r1 = *(u64 *)(r1 +0)
; if (arg == 0)
1: (55) if r1 != 0x0 goto pc+4
; test7_result = 1;
2: (18) r1 = map[id:6][0]+48
4: (b7) r2 = 1
5: (7b) *(u64 *)(r1 +0) = r2
; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
6: (b4) w0 = 0
7: (95) exit
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630171241.2523875-1-yhs@fb.com
Calling bpf_prog_detach is incorrect, since it takes target_fd as
its argument. The intention here is to pass it as attach_bpf_fd,
so use bpf_prog_detach2 and pass zero for target_fd.
Fixes: 06716e04a0 ("selftests/bpf: Extend test_flow_dissector to cover link creation")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200629095630.7933-7-lmb@cloudflare.com
Pass 0 as target_fd when attaching and detaching flow dissector.
Additionally, pass the expected program when detaching.
Fixes: 1f043f87bb ("selftests/bpf: Add tests for attaching bpf_link to netns")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200629095630.7933-6-lmb@cloudflare.com
This case, while not particularly useful, is worth covering because we
expect the operation to succeed as opposed when re-attaching the same
program directly with PROG_ATTACH.
While at it, update the tests summary that fell out of sync when tests
extended to cover links.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200625141357.910330-5-jakub@cloudflare.com
Add tests to check ethtool report about extended state.
The tests configure several states and verify that the correct extended
state is reported by ethtool.
Check extended state with substate (Autoneg) and extended state without
substate (No cable).
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add NETIF_NO_CABLE port to tests topology.
The port can also be declared as an environment variable and tests can be
run like that:
NETIF_NO_CABLE=eth9 ./test.sh eth{1..8}
The NETIF_NO_CABLE port will be used by ethtool_extended_state test.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently different_speeds_get() is used only by ethtool.sh tests.
The function can be useful for another tests that check ethtool
configurations.
Move the function to ethtool_lib in order to allow other tests to use
it.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This test is inspired by the mlxsw RED selftest. It is much simpler to set
up (also because there is no point in testing PRIO / RED encapsulation). It
tests bare RED, ECN and ECN+nodrop modes of operation. On top of that it
tests RED early_drop and mark qevents.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The reverted commit illegitly uses tpm2-tools. External dependencies are
absolutely forbidden from these tests. There is also the problem that
clearing is not necessarily wanted behavior if the test/target computer is
not used only solely for testing.
Fixes: a9920d3bad ("tpm: selftest: cleanup after unseal with wrong auth/policy test")
Cc: Tadeusz Struk <tadeusz.struk@intel.com>
Cc: stable@vger.kernel.org
Cc: linux-integrity@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Address KCOV vs noinstr. There is no function attribute to selectively
suppress KCOV instrumentation, instead teach objtool to NOP out the
calls in noinstr functions.
This cures a bunch of KCOV crashes (as used by syzcaller).
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl74pC0ACgkQEsHwGGHe
VUobXA//cJvRCujUriL6HjZZxmqrWKYyB4kH4yFVycJ7DRflGk3QGLpnHJifWWUL
eG50obtNI+KOWrr/0lY7XURZgr1mVDe0L3z0tdBJH/rCiQPraDf2JPpCSRRtdq/a
MvbRXE14z3YLeRI2CurRBH+ZmveBRu2Gv9APPym0CqGBhX3rRRKoyOOiQS95PCZB
pehuYjbLLrLCQvFoANq3ZwHyLZzczhhwgVBSl+UgdDBwrbM5VC6ByxtEkRgcwoqt
Tvhji0HqjV4Nqu23/PUsR53hkp+kQrdfe2vaC7IeISWxusMTXCMFOYlZNR4xnQ/f
M7No8eZK+/j7KsI6/8hfRMvTeis21IMUCV9gRXZYpSWfbf4vKBsYFoIAMxQTNyBo
t/7BUqwTA9eLtUoaTCZim5a/n1nNWWPnnd74DYmQ7KilGgS3HO9dDwNrPnJhDUYZ
Ed6Wb0Jgk4s8+TxQEEx8j9bVfpxJGuL+BzcrqdRSCIHV12CRRzUigSadW5/4OR6S
XNVzY1Si0RGKI5K3OJAZDP5YaPWNXu8SwQUzaZDXjt8qavljqvDfY7GXIdhRNPCY
6o/H8i/iHXn5v3nSpGKrAeDBqXP8BncvP2ux1Zs3/uBdPgU1dFcYBrUEZxStjDWU
tyX6tNIU7pGMvXSiEsKzSpb1/LkzR+zG7z//DC3WCYNqP4KdaoE=
=0Wd6
-----END PGP SIGNATURE-----
Merge tag 'objtool_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:
"Three fixes from Peter Zijlstra suppressing KCOV instrumentation in
noinstr sections.
Peter Zijlstra says:
"Address KCOV vs noinstr. There is no function attribute to
selectively suppress KCOV instrumentation, instead teach objtool
to NOP out the calls in noinstr functions"
This cures a bunch of KCOV crashes (as used by syzcaller)"
* tag 'objtool_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix noinstr vs KCOV
objtool: Provide elf_write_{insn,reloc}()
objtool: Clean up elf_write() condition
Validate that BPF object with broken (in multiple ways) BPF program can still
be successfully loaded, if that broken BPF program is disabled.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200625232629.3444003-3-andriin@fb.com
Currently, bpf_object__load() (and by induction skeleton's load), will always
attempt to prepare, relocate, and load into kernel every single BPF program
found inside the BPF object file. This is often convenient and the right thing
to do and what users expect.
But there are plenty of cases (especially with BPF development constantly
picking up the pace), where BPF application is intended to work with old
kernels, with potentially reduced set of features. But on kernels supporting
extra features, it would like to take a full advantage of them, by employing
extra BPF program. This could be a choice of using fentry/fexit over
kprobe/kretprobe, if kernel is recent enough and is built with BTF. Or BPF
program might be providing optimized bpf_iter-based solution that user-space
might want to use, whenever available. And so on.
With libbpf and BPF CO-RE in particular, it's advantageous to not have to
maintain two separate BPF object files to achieve this. So to enable such use
cases, this patch adds ability to request not auto-loading chosen BPF
programs. In such case, libbpf won't attempt to perform relocations (which
might fail due to old kernel), won't try to resolve BTF types for
BTF-aware (tp_btf/fentry/fexit/etc) program types, because BTF might not be
present, and so on. Skeleton will also automatically skip auto-attachment step
for such not loaded BPF programs.
Overall, this feature allows to simplify development and deployment of
real-world BPF applications with complicated compatibility requirements.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200625232629.3444003-2-andriin@fb.com
These patches address a number of instrumentation issues that were found after
the x86/entry overhaul. When combined with rcu/urgent and objtool/urgent, these
patches make UBSAN/KASAN/KCSAN happy again.
Part of making this all work is bumping the minimum GCC version for KASAN
builds to gcc-8.3, the reason for this is that the __no_sanitize_address
function attribute is broken in GCC releases before that.
No known GCC version has a working __no_sanitize_undefined, however because the
only noinstr violation that results from this happens when an UB is found, we
treat it like WARN. That is, we allow it to violate the noinstr rules in order
to get the warning out.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl74oWMACgkQEsHwGGHe
VUpZCw/5AfanXrEixuh4hZLPBOJ7MtW0YI3eyBRJ8j14R8iaK+Hvn/yU4/+qC2jj
eAlc42QS6Ckzcdknyy8VpHVDR7LR2angN0ePJmrbKsjYq0LTrnfa2H5uABcAQoiW
0BuGFub0QBRjCkxgsOoG3llqWsTkhRrGX1928lCuuK+8L+kB0bREGMqpR36EBFaS
wIyLodLO/Bd+YcoWDMvm4I6FvHcdyY3Oq++mzro+5ye7bE9s0PpMC5IXNzmIuGmR
31UvST+ooRMsM6GlhxHpn6pZuCqfjygXAYuuutwdK10g1f75ESkQdYz9T9KDlHrF
4GqzcCGtOlN4DAvk3L7KGfHw3XIhioGFxeRT+gGgKsnxoBjvJXJ8x9GrcLA9jdJi
WeqlqiEOiAa949nclwQQ+fSrx4LgLhJ8bexyOkwiRPx7R75Y0e6OqpxZtE6GiL8O
BA6Z6cR7U8H4uhKIzZZ0NJiLwO1cSGo5Uz/ERcyg4L23rHYKrDdaQwFSDUxXWq/s
2lEqISD0WrSwMxJtfET3zB0B20n6IO7Uszo0FdnDFO62fck8HlStZsqV4meoT2Cc
moqIZsYc3qnESxO9OhWHdSGGAyGS0qcE4Sq/oM8d2dIvIeL4KwHqTE6QFSmcUivi
QYdXIIQnqJgqX4dmvLFrTuI2Whc86oS40U5/Dhv7BlHx0oewSlg=
=fcu1
-----END PGP SIGNATURE-----
Merge tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 entry fixes from Borislav Petkov:
"This is the x86/entry urgent pile which has accumulated since the
merge window.
It is not the smallest but considering the almost complete entry core
rewrite, the amount of fixes to follow is somewhat higher than usual,
which is to be expected.
Peter Zijlstra says:
'These patches address a number of instrumentation issues that were
found after the x86/entry overhaul. When combined with rcu/urgent
and objtool/urgent, these patches make UBSAN/KASAN/KCSAN happy
again.
Part of making this all work is bumping the minimum GCC version for
KASAN builds to gcc-8.3, the reason for this is that the
__no_sanitize_address function attribute is broken in GCC releases
before that.
No known GCC version has a working __no_sanitize_undefined, however
because the only noinstr violation that results from this happens
when an UB is found, we treat it like WARN. That is, we allow it to
violate the noinstr rules in order to get the warning out'"
* tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry: Fix #UD vs WARN more
x86/entry: Increase entry_stack size to a full page
x86/entry: Fixup bad_iret vs noinstr
objtool: Don't consider vmlinux a C-file
kasan: Fix required compiler version
compiler_attributes.h: Support no_sanitize_undefined check with GCC 4
x86/entry, bug: Comment the instrumentation_begin() usage for WARN()
x86/entry, ubsan, objtool: Whitelist __ubsan_handle_*()
x86/entry, cpumask: Provide non-instrumented variant of cpu_is_offline()
compiler_types.h: Add __no_sanitize_{address,undefined} to noinstr
kasan: Bump required compiler version
x86, kcsan: Add __no_kcsan to noinstr
kcsan: Remove __no_kcsan_or_inline
x86, kcsan: Remove __no_kcsan_or_inline usage
A fix for a crash in nested KVM when CONFIG_DEBUG_VIRTUAL=y.
Two minor build fixes.
Thanks to:
Aneesh Kumar K.V, Arseny Solokha, Harish.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl73L58THG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFHvD/9InTRjjJ8VPoZri5d4UNP1xPcAMF6P
u9KppOVJDHaRgI58xtWgfH3eXVg90OY/u6nVAfaXIlrhH5BCEnx/+gRRoQXBR8vE
MkoGevVCo6/2FpzAQHAQpaz5BrqipMA0EGBrUvK4+doWJf+lCcuTu4pXCawlbjr+
Hbc4NChEuqLYkEnWidQL+MxAoJH9Z1HiXmilWRj+8rG0HnMMH6oqDurxUhGmA6/I
1LHarXkPeWoxx1UQLMN5kDsTLdxCLmz9oqG5EON2vQjob7Zu8sv4WvEZjx7qibHy
yEe1YUO7xqv6OatGyWvF6FXq6H4udK1djlvFz4FBjOg6YPusrmSIDT5JLth3Q9Fi
P8Z1DsiHAljgq3bfwIFdFQRG35JAzV1LmEI89z6VADGPPZuIwUCoagnfnQUBaPfg
VTlnYJlUEmCDAH6R/OSUIOpVqmblCukeSIObvigcsyamPrAZZKl0PPw3ns+Uzykw
WJ1ukCyppscPwZ00+hByUXOmX6ZBMHoFakgpd3b09RUHiMbrjvpgaUFEwqSl1AsA
yJ83uqaywqjewKH0uMwqRtw7j/BfXa47hYYqNeBcxqu9h5ORMchAIB4rgbO2LDgr
ZsvhiSlBqEL8j5HnzVYEaFPniE6NWesx1JJLOD41fWN+Fa9SFzkVfZy6rQyhOLU8
rjQQ+iJJVFreqw==
=TvGN
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- A fix for a crash in nested KVM when CONFIG_DEBUG_VIRTUAL=y.
- Two minor build fixes.
Thanks to: Aneesh Kumar K.V, Arseny Solokha, Harish.
* tag 'powerpc-5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Fix build failure in ebb tests
powerpc/kvm/book3s64: Fix kernel crash with nested kvm & DEBUG_VIRTUAL
powerpc/fsl_booke/32: Fix build with CONFIG_RANDOMIZE_BASE
- Fix unwinding through vDSO sigreturn trampoline
- Fix build warnings by raising minimum LD version for PAC
- Whitelist some Kryo Cortex-A55 derivatives for Meltdown and SSB
- Fix perf register PC reporting for compat tasks
- Fix 'make clean' warning for arm64 signal selftests
- Fix ftrace when BTI is compiled in
- Avoid building the compat vDSO using GCC plugins
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl71tgEQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNB5sB/48VLEeDtkRtHVQntLG9SFogwDkHjkRW/lo
kgO5APEcdhZZq3mBY2fIww5iX5Et7vRpx8ovempmqZGhO9B4ZMSNG0DFxoYdtXTU
jgox+LzkW+hYldK1Bv03ioLZgIz6Lc8zyK6kRB7NuDN88VEVds0ksYmcAojeIN9b
vmpquEAoVppm0VPjt6VA0xQ6HtiKfvlV7PW6Pqs0dKovnNY982jRXBMzaGBbDFQ7
3eKmW4PBru/Ew16J172vf/0sBJQBiZrSdXCqv/USKvPHkUDkJiYsaWLpsWx4m4to
bE/OS6aWx94NcgxPUca3y2G2OhPU+VFiXjuJ0kvzt4EJIuW/CGUf
=2kBR
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The big fix here is to our vDSO sigreturn trampoline as, after a
painfully long stint of debugging, it turned out that fixing some of
our CFI directives in the merge window lit up a bunch of logic in
libgcc which has been shown to SEGV in some cases during asynchronous
pthread cancellation.
It looks like we can fix this by extending the directives to restore
most of the interrupted register state from the sigcontext, but it's
risky and hard to test so we opted to remove the CFI directives for
now and rely on the unwinder fallback path like we used to.
- Fix unwinding through vDSO sigreturn trampoline
- Fix build warnings by raising minimum LD version for PAC
- Whitelist some Kryo Cortex-A55 derivatives for Meltdown and SSB
- Fix perf register PC reporting for compat tasks
- Fix 'make clean' warning for arm64 signal selftests
- Fix ftrace when BTI is compiled in
- Avoid building the compat vDSO using GCC plugins"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Add KRYO{3,4}XX silver CPU cores to SSB safelist
arm64: perf: Report the PC value in REGS_ABI_32 mode
kselftest: arm64: Remove redundant clean target
arm64: kpti: Add KRYO{3, 4}XX silver CPU cores to kpti safelist
arm64: Don't insert a BTI instruction at inner labels
arm64: vdso: Don't use gcc plugins for building vgettimeofday.c
arm64: vdso: Only pass --no-eh-frame-hdr when linker supports it
arm64: Depend on newer binutils when building PAC
arm64: compat: Remove 32-bit sigreturn code from the vDSO
arm64: compat: Always use sigpage for sigreturn trampoline
arm64: compat: Allow 32-bit vdso and sigpage to co-exist
arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline
When separating out different phases of running tests[1]
(build/exec/parse/etc), the format of the KunitResult tuple changed
(adding an elapsed_time variable). This is not populated during a build
failure, causing kunit.py to crash.
This fixes [1] to probably populate the result variable, causing a
failing build to be reported properly.
[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=45ba7a893ad89114e773b3dc32f6431354c465d6
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Currently, if the kernel is configured incorrectly or if it crashes before any
kunit tests are run, kunit finishes without error, reporting
that 0 test cases were run.
To fix this, an error is shown when the tap header is not found, which
indicates that kunit was not able to run at all.
Signed-off-by: Uriel Guajardo <urielguajardo@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Commit 8b59cd81dc ("kbuild: ensure full rebuild when the compiler is
updated") introduced a new CONFIG option CONFIG_CC_VERSION_TEXT. On my
system, this is set to "gcc (GCC) 10.1.0" which breaks KUnit config
parsing which did not like the spaces in the string.
Fix this by updating the regex to allow strings containing spaces.
Fixes: 8b59cd81dc ("kbuild: ensure full rebuild when the compiler is updated")
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
We use OUTPUT directory as TMPOUT for checking no-pie option.
Since commit f2f02ebd8f ("kbuild: improve cc-option to clean up all
temporary files") when building powerpc/ from selftests directory, the
OUTPUT directory points to powerpc/pmu/ebb/ and gets removed when
checking for -no-pie option in try-run routine, subsequently build
fails with the following:
$ make -C powerpc
...
TARGET=ebb; BUILD_TARGET=$OUTPUT/$TARGET; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C $TARGET all
make[2]: Entering directory '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb'
make[2]: *** No rule to make target 'Makefile'.
make[2]: Failed to remake makefile 'Makefile'.
make[2]: *** No rule to make target 'ebb.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'ebb_handler.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'trace.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'busy_loop.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: Target 'all' not remade because of errors.
Fix this by adding a suffix to the OUTPUT directory so that the
failure is avoided.
Fixes: 9686813f6e ("selftests/powerpc: Fix try-run when source tree is not writable")
Signed-off-by: Harish <harish@linux.ibm.com>
[mpe: Mention that commit that triggered the breakage]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200625165721.264904-1-harish@linux.ibm.com
Minor overlapping changes in xfrm_device.c, between the double
ESP trailing bug fix setting the XFRM_INIT flag and the changes
in net-next preparing for bonding encryption support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Don't insert ESP trailer twice in IPSEC code, from Huy Nguyen.
2) The default crypto algorithm selection in Kconfig for IPSEC is out
of touch with modern reality, fix this up. From Eric Biggers.
3) bpftool is missing an entry for BPF_MAP_TYPE_RINGBUF, from Andrii
Nakryiko.
4) Missing init of ->frame_sz in xdp_convert_zc_to_xdp_frame(), from
Hangbin Liu.
5) Adjust packet alignment handling in ax88179_178a driver to match
what the hardware actually does. From Jeremy Kerr.
6) register_netdevice can leak in the case one of the notifiers fail,
from Yang Yingliang.
7) Use after free in ip_tunnel_lookup(), from Taehee Yoo.
8) VLAN checks in sja1105 DSA driver need adjustments, from Vladimir
Oltean.
9) tg3 driver can sleep forever when we get enough EEH errors, fix from
David Christensen.
10) Missing {READ,WRITE}_ONCE() annotations in various Intel ethernet
drivers, from Ciara Loftus.
11) Fix scanning loop break condition in of_mdiobus_register(), from
Florian Fainelli.
12) MTU limit is incorrect in ibmveth driver, from Thomas Falcon.
13) Endianness fix in mlxsw, from Ido Schimmel.
14) Use after free in smsc95xx usbnet driver, from Tuomas Tynkkynen.
15) Missing bridge mrp configuration validation, from Horatiu Vultur.
16) Fix circular netns references in wireguard, from Jason A. Donenfeld.
17) PTP initialization on recovery is not done properly in qed driver,
from Alexander Lobakin.
18) Endian conversion of L4 ports in filters of cxgb4 driver is wrong,
from Rahul Lakkireddy.
19) Don't clear bound device TX queue of socket prematurely otherwise we
get problems with ktls hw offloading, from Tariq Toukan.
20) ipset can do atomics on unaligned memory, fix from Russell King.
21) Align ethernet addresses properly in bridging code, from Thomas
Martitz.
22) Don't advertise ipv4 addresses on SCTP sockets having ipv6only set,
from Marcelo Ricardo Leitner.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (149 commits)
rds: transport module should be auto loaded when transport is set
sch_cake: fix a few style nits
sch_cake: don't call diffserv parsing code when it is not needed
sch_cake: don't try to reallocate or unshare skb unconditionally
ethtool: fix error handling in linkstate_prepare_data()
wil6210: account for napi_gro_receive never returning GRO_DROP
hns: do not cast return value of napi_gro_receive to null
socionext: account for napi_gro_receive never returning GRO_DROP
wireguard: receive: account for napi_gro_receive never returning GRO_DROP
vxlan: fix last fdb index during dump of fdb with nhid
sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
tc-testing: avoid action cookies with odd length.
bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
net: dsa: sja1105: fix tc-gate schedule with single element
net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
net: dsa: sja1105: unconditionally free old gating config
net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top
net: macb: free resources on failure path of at91ether_open()
net: macb: call pm_runtime_put_sync on failure path
...
Update odd length cookie hexstrings in csum.json, tunnel_key.json and
bpf.json to be even length to comply with check enforced in commit
0149dabf2a1b ("tc: m_actions: check cookie hexstring len") in iproute2.
Signed-off-by: Briana Oursler <briana.oursler@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Apply the fix from:
"tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT"
to the BPF implementation of TCP CUBIC congestion control.
Repeating the commit description here for completeness:
Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where
Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
ACK when the minimum rtt of a connection goes down. From inspection it
is clear from the existing code that this could happen in an example
like the following:
o The first 8 RTT samples in a round trip are 150ms, resulting in a
curr_rtt of 150ms and a delay_min of 150ms.
o The 9th RTT sample is 100ms. The curr_rtt does not change after the
first 8 samples, so curr_rtt remains 150ms. But delay_min can be
lowered at any time, so delay_min falls to 100ms. The code executes
the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
of 100ms, and the curr_rtt is declared far enough above delay_min to
force a (spurious) exit of Slow start.
The fix here is simple: allow every RTT sample in a round trip to
lower the curr_rtt.
Fixes: 6de4a9c430 ("bpf: tcp: Add bpf_cubic example")
Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust the SEC("xdp_devmap/") prog type prefix to contain a
slash "/" for expected attach type BPF_XDP_DEVMAP. This is consistent
with other prog types like tracing.
Fixes: 2778797037 ("libbpf: Add SEC name for xdp programs attached to device map")
Suggested-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159309521882.821855.6873145686353617509.stgit@firesoul
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net, they are:
1) Unaligned atomic access in ipset, from Russell King.
2) Missing module description, from Rob Gill.
3) Patches to fix a module unload causing NULL pointer dereference in
xtables, from David Wilder. For the record, I posting here his cover
letter explaining the problem:
A crash happened on ppc64le when running ltp network tests triggered by
"rmmod iptable_mangle".
See previous discussion in this thread:
https://lists.openwall.net/netdev/2020/06/03/161 .
In the crash I found in iptable_mangle_hook() that
state->net->ipv4.iptable_mangle=NULL causing a NULL pointer dereference.
net->ipv4.iptable_mangle is set to NULL in +iptable_mangle_net_exit() and
called when ip_mangle modules is unloaded. A rmmod task was found running
in the crash dump. A 2nd crash showed the same problem when running
"rmmod iptable_filter" (net->ipv4.iptable_filter=NULL).
To fix this I added .pre_exit hook in all iptable_foo.c. The pre_exit will
un-register the underlying hook and exit would do the table freeing. The
netns core does an unconditional +synchronize_rcu after the pre_exit hooks
insuring no packets are in flight that have picked up the pointer before
completing the un-register.
These patches include changes for both iptables and ip6tables.
We tested this fix with ltp running iptables01.sh and iptables01.sh -6 a
loop for 72 hours.
4) Add a selftest for conntrack helper assignment, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Define attach_type_name in common.c instead of main.h so it is only
defined once. This leads to a slight decrease in the binary size of
bpftool.
Before:
text data bss dec hex filename
399024 11168 1573160 1983352 1e4378 bpftool
After:
text data bss dec hex filename
398256 10880 1573160 1982296 1e3f58 bpftool
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200624143154.13145-1-tklauser@distanz.ch
Define prog_type_name in prog.c instead of main.h so it is only defined
once. This leads to a slight decrease in the binary size of bpftool.
Before:
text data bss dec hex filename
401032 11936 1573160 1986128 1e4e50 bpftool
After:
text data bss dec hex filename
399024 11168 1573160 1983352 1e4378 bpftool
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200624143124.12914-1-tklauser@distanz.ch
These newly added macros will be used in subsequent bpf iterator
tcp{4,6} and udp{4,6} programs.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230819.3989050-1-yhs@fb.com
Refactor bpf_iter_ipv6_route.c and bpf_iter_netlink.c
so net macros, originally from various include/linux header
files, are moved to a new header file
bpf_tracing_net.h. The goal is to improve reuse so
networking tracing programs do not need to
copy these macros every time they use them.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230817.3988962-1-yhs@fb.com
Commit b9f4c01f3e ("selftest/bpf: Make bpf_iter selftest
compilable against old vmlinux.h") and Commit dda18a5c0b
("selftests/bpf: Convert bpf_iter_test_kern{3, 4}.c to define
own bpf_iter_meta") redefined newly introduced types
in bpf programs so the bpf program can still compile
properly with old kernels although loading may fail.
Since this patch set introduced new types and the same
workaround is needed, so let us move the workaround
to a separate header file so they do not clutter
bpf programs.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230816.3988656-1-yhs@fb.com
The helper is used in tracing programs to cast a socket
pointer to a udp6_sock pointer.
The return value could be NULL if the casting is illegal.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com
Three more helpers are added to cast a sock_common pointer to
an tcp_sock, tcp_timewait_sock or a tcp_request_sock for
tracing programs.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com
The helper is used in tracing programs to cast a socket
pointer to a tcp6_sock pointer.
The return value could be NULL if the casting is illegal.
A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added
so the verifier is able to deduce proper return types for the helper.
Different from the previous BTF_ID based helpers,
the bpf_skc_to_tcp6_sock() argument can be several possible
btf_ids. More specifically, all possible socket data structures
with sock_common appearing in the first in the memory layout.
This patch only added socket types related to tcp and udp.
All possible argument btf_id and return value btf_id
for helper bpf_skc_to_tcp6_sock() are pre-calculcated and
cached. In the future, it is even possible to precompute
these btf_id's at kernel build time.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com
check that 'nft ... ct helper set <foo>' works:
1. configure ftp helper via nft and assign it to
connections on port 2121
2. check with 'conntrack -L' that the next connection
has the ftp helper attached to it.
Also add a test for auto-assign (old behaviour).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes all over the place.
This includes a couple of tests that I would normally defer,
but since they have already been helpful in catching some bugs,
don't build for any users at all, and having them
upstream makes life easier for everyone, I think it's
ok even at this late stage.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl7w3hcPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpJ6UH/jVFVzILp3M5eVCewNsS2/j5HvRHxKb4DTwh
eoNTYMSLMy5UnwOdNZ3RjmNHxc8hS04fQsHZpI6mAIPSLWdswjmRwQPN03/fvP9M
nVdy3SxyU2ZzHbX/hRi3eTwhOkOY4RcyvkOjw49LLokIX6/DzsAO5ql3NAZW9vbL
LqZQKzAb7BeltH5RWbPPdF2UCZKia/x9QYHIXh0gJZj5YaAzlvJywNDQcxuiIODb
ZmHZZBaqZ66n99hWnSKfHtqX/fqRDqrdyameIVaSwgewho2AMj9V1FvvVMJWm669
PjQuHDf03TBBDNP9UwJRjdko7qA6aqX5Tef9aGMyBy2LJUD12A8=
=Z/dQ
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Fixes all over the place.
This includes a couple of tests that I would normally defer, but since
they have already been helpful in catching some bugs, don't build for
any users at all, and having them upstream makes life easier for
everyone, I think it's ok even at this late stage"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
tools/virtio: Use tools/include/list.h instead of stubs
tools/virtio: Reset index in virtio_test --reset.
tools/virtio: Extract virtqueue initialization in vq_reset
tools/virtio: Use __vring_new_virtqueue in virtio_test.c
tools/virtio: Add --reset
tools/virtio: Add --batch=random option
tools/virtio: Add --batch option
virtio-mem: add memory via add_memory_driver_managed()
virtio-mem: silence a static checker warning
vhost_vdpa: Fix potential underflow in vhost_vdpa_mmap()
vdpa: fix typos in the comments for __vdpa_alloc_device()
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXvNBJgAKCRCRxhvAZXjc
oulGAPoCPfCguA8TPcy4tq4byGPoThyO4XnWR6XcUDOEzhbzzAEA+s5S7iRV8W92
p2gzbI4Kncq4dQNEtUvfPHQZDAEwTA0=
=eZDz
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2020-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull thread fix from Christian Brauner:
"This fixes a regression introduced with 303cc571d1 ("nsproxy: attach
to namespaces via pidfds").
The LTP testsuite reported a regression where users would now see
EBADF returned instead of EINVAL when an fd was passed that referred
to an open file but the file was not a namespace file.
Fix this by continuing to report EINVAL and add a regression test"
* tag 'for-linus-2020-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
tests: test for setns() EINVAL regression
nsproxy: restore EINVAL for non-namespace file descriptor
This patch adds support of SO_KEEPALIVE flag and TCP related options
to bpf_setsockopt() routine. This is helpful if we want to enable or tune
TCP keepalive for applications which don't do it in the userspace code.
v3:
- update kernel-doc in uapi (Nikita Vetoshkin <nekto0n@yandex-team.ru>)
v4:
- update kernel-doc in tools too (Alexei Starovoitov)
- add test to selftests (Alexei Starovoitov)
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200620153052.9439-3-zeil@yandex-team.ru
./test_progs-no_alu32 -t get_stack_raw_tp
fails due to:
52: (85) call bpf_get_stack#67
53: (bf) r8 = r0
54: (bf) r1 = r8
55: (67) r1 <<= 32
56: (c7) r1 s>>= 32
; if (usize < 0)
57: (c5) if r1 s< 0x0 goto pc+26
R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800
; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
58: (1f) r9 -= r8
; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
59: (bf) r2 = r7
60: (0f) r2 += r1
regs=1 stack=0 before 52: (85) call bpf_get_stack#67
; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
61: (bf) r1 = r6
62: (bf) r3 = r9
63: (b7) r4 = 0
64: (85) call bpf_get_stack#67
R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608)
R3 unbounded memory access, use 'var &= const' or 'if (var < const)'
In the C code:
usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
if (usize < 0)
return 0;
ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
if (ksize < 0)
return 0;
We used to have problem with pointer arith in R2.
Now it's a problem with two integers in R3.
'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800].
Both registers represent the same 'usize' variable.
Then R9 -= R8 is doing 800 - [-inf, 800]
so the result of "max_len - usize" looks unbounded to the verifier while
it's obvious in C code that "max_len - usize" should be [0, 800].
To workaround the problem convert ksize and usize variables from int to long.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Prevent loading/parsing vmlinux BTF twice in some cases: for CO-RE relocations
and for BTF-aware hooks (tp_btf, fentry/fexit, etc).
Fixes: a6ed02cac6 ("libbpf: Load btf_vmlinux only once per object.")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200624043805.1794620-1-andriin@fb.com
Building bpftool yields the following complaint:
pids.c: In function 'emit_obj_refs_json':
pids.c:175:80: warning: declaration of 'json_wtr' shadows a global declaration [-Wshadow]
175 | void emit_obj_refs_json(struct obj_refs_table *table, __u32 id, json_writer_t *json_wtr)
| ~~~~~~~~~~~~~~~^~~~~~~~
In file included from pids.c:11:
main.h:141:23: note: shadowed declaration is here
141 | extern json_writer_t *json_wtr;
| ^~~~~~~~
Let's rename the variable.
v2:
- Rename the variable instead of calling the global json_wtr directly.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623213600.16643-1-quentin@isovalent.com
The arm64 signal tests generate warnings during build since both they and
the toplevel lib.mk define a clean target:
Makefile:25: warning: overriding recipe for target 'clean'
../../lib.mk:126: warning: ignoring old recipe for target 'clean'
Since the inclusion of lib.mk is in the signal Makefile there is no
situation where this warning could be avoided so just remove the redundant
clean target.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200624104933.21125-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Run rxtimestamp as part of TEST_PROGS. Analogous to other tests, add
new rxtimestamp.sh wrapper script, so that the test runs isolated
from background traffic in a private network namespace.
Also ignore failures of test case #6 by default. This case verifies
that a receive timestamp is not reported if timestamp reporting is
enabled for a socket, but generation is disabled. Receive timestamp
generation has to be enabled globally, as no associated socket is
known yet. A background process that enables rx timestamp generation
therefore causes a false positive. Ntpd is one example that does.
Add a "--strict" option to cause failure in the event that any test
case fails, including test #6. This is useful for environments that
are known to not have such background processes.
Tested:
make -C tools/testing/selftests TARGETS="net" run_tests
Signed-off-by: Tanner Love <tannerlove@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When producing the bpf-helpers.7 man page from the documentation from
the BPF user space header file, rst2man complains:
<stdin>:2636: (ERROR/3) Unexpected indentation.
<stdin>:2640: (WARNING/2) Block quote ends without a blank line; unexpected unindent.
Let's fix formatting for the relevant chunk (item list in
bpf_ringbuf_query()'s description), and for a couple other functions.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623153935.6215-1-quentin@isovalent.com
bpf_object__find_program_by_title(), used by CO-RE relocation code, doesn't
return .text "BPF program", if it is a function storage for sub-programs.
Because of that, any CO-RE relocation in helper non-inlined functions will
fail. Fix this by searching for .text-corresponding BPF program manually.
Adjust one of bpf_iter selftest to exhibit this pattern.
Fixes: ddc7c30426 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200619230423.691274-1-andriin@fb.com
Currently, if the clang-bpf-co-re feature is not available, the build
fails with e.g.
CC prog.o
prog.c:1462:10: fatal error: profiler.skel.h: No such file or directory
1462 | #include "profiler.skel.h"
| ^~~~~~~~~~~~~~~~~
This is due to the fact that the BPFTOOL_WITHOUT_SKELETONS macro is not
defined, despite BUILD_BPF_SKELS not being set. Fix this by correctly
evaluating $(BUILD_BPF_SKELS) when deciding on whether to add
-DBPFTOOL_WITHOUT_SKELETONS to CFLAGS.
Fixes: 05aca6da3b ("tools/bpftool: Generalize BPF skeleton support and generate vmlinux.h")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200623103710.10370-1-tklauser@distanz.ch
Extend original variable-length tests with a case to catch a common
existing pattern of testing for < 0 for errors. Note because
verifier also tracks upper bounds and we know it can not be greater
than MAX_LEN here we can skip upper bound check.
In ALU64 enabled compilation converting from long->int return types
in probe helpers results in extra instruction pattern, <<= 32, s >>= 32.
The trade-off is the non-ALU64 case works. If you really care about
every extra insn (XDP case?) then you probably should be using original
int type.
In addition adding a sext insn to bpf might help the verifier in the
general case to avoid these types of tricks.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200623032224.4020118-3-andriin@fb.com
Add selftest that validates variable-length data reading and concatentation
with one big shared data array. This is a common pattern in production use for
monitoring and tracing applications, that potentially can read a lot of data,
but overall read much less. Such pattern allows to determine precisely what
amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200623032224.4020118-2-andriin@fb.com
Switch most of BPF helper definitions from returning int to long. These
definitions are coming from comments in BPF UAPI header and are used to
generate bpf_helper_defs.h (under libbpf) to be later included and used from
BPF programs.
In actual in-kernel implementation, all the helpers are defined as returning
u64, but due to some historical reasons, most of them are actually defined as
returning int in UAPI (usually, to return 0 on success, and negative value on
error).
This actually causes Clang to quite often generate sub-optimal code, because
compiler believes that return value is 32-bit, and in a lot of cases has to be
up-converted (usually with a pair of 32-bit bit shifts) to 64-bit values,
before they can be used further in BPF code.
Besides just "polluting" the code, these 32-bit shifts quite often cause
problems for cases in which return value matters. This is especially the case
for the family of bpf_probe_read_str() functions. There are few other similar
helpers (e.g., bpf_read_branch_records()), in which return value is used by
BPF program logic to record variable-length data and process it. For such
cases, BPF program logic carefully manages offsets within some array or map to
read variable-length data. For such uses, it's crucial for BPF verifier to
track possible range of register values to prove that all the accesses happen
within given memory bounds. Those extraneous zero-extending bit shifts,
inserted by Clang (and quite often interleaved with other code, which makes
the issues even more challenging and sometimes requires employing extra
per-variable compiler barriers), throws off verifier logic and makes it mark
registers as having unknown variable offset. We'll study this pattern a bit
later below.
Another common pattern is to check return of BPF helper for non-zero state to
detect error conditions and attempt alternative actions in such case. Even in
this simple and straightforward case, this 32-bit vs BPF's native 64-bit mode
quite often leads to sub-optimal and unnecessary extra code. We'll look at
this pattern as well.
Clang's BPF target supports two modes of code generation: ALU32, in which it
is capable of using lower 32-bit parts of registers, and no-ALU32, in which
only full 64-bit registers are being used. ALU32 mode somewhat mitigates the
above described problems, but not in all cases.
This patch switches all the cases in which BPF helpers return 0 or negative
error from returning int to returning long. It is shown below that such change
in definition leads to equivalent or better code. No-ALU32 mode benefits more,
but ALU32 mode doesn't degrade or still gets improved code generation.
Another class of cases switched from int to long are bpf_probe_read_str()-like
helpers, which encode successful case as non-negative values, while still
returning negative value for errors.
In all of such cases, correctness is preserved due to two's complement
encoding of negative values and the fact that all helpers return values with
32-bit absolute value. Two's complement ensures that for negative values
higher 32 bits are all ones and when truncated, leave valid negative 32-bit
value with the same value. Non-negative values have upper 32 bits set to zero
and similarly preserve value when high 32 bits are truncated. This means that
just casting to int/u32 is correct and efficient (and in ALU32 mode doesn't
require any extra shifts).
To minimize the chances of regressions, two code patterns were investigated,
as mentioned above. For both patterns, BPF assembly was analyzed in
ALU32/NO-ALU32 compiler modes, both with current 32-bit int return type and
new 64-bit long return type.
Case 1. Variable-length data reading and concatenation. This is quite
ubiquitous pattern in tracing/monitoring applications, reading data like
process's environment variables, file path, etc. In such case, many pieces of
string-like variable-length data are read into a single big buffer, and at the
end of the process, only a part of array containing actual data is sent to
user-space for further processing. This case is tested in test_varlen.c
selftest (in the next patch). Code flow is roughly as follows:
void *payload = &sample->payload;
u64 len;
len = bpf_probe_read_kernel_str(payload, MAX_SZ1, &source_data1);
if (len <= MAX_SZ1) {
payload += len;
sample->len1 = len;
}
len = bpf_probe_read_kernel_str(payload, MAX_SZ2, &source_data2);
if (len <= MAX_SZ2) {
payload += len;
sample->len2 = len;
}
/* and so on */
sample->total_len = payload - &sample->payload;
/* send over, e.g., perf buffer */
There could be two variations with slightly different code generated: when len
is 64-bit integer and when it is 32-bit integer. Both variations were analysed.
BPF assembly instructions between two successive invocations of
bpf_probe_read_kernel_str() were used to check code regressions. Results are
below, followed by short analysis. Left side is using helpers with int return
type, the right one is after the switch to long.
ALU32 + INT ALU32 + LONG
=========== ============
64-BIT (13 insns): 64-BIT (10 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: if w0 > 256 goto +9 <LBB0_4> 18: if r0 > 256 goto +6 <LBB0_4>
19: w1 = w0 19: r1 = 0 ll
20: r1 <<= 32 21: *(u64 *)(r1 + 0) = r0
21: r1 s>>= 32 22: r6 = 0 ll
22: r2 = 0 ll 24: r6 += r0
24: *(u64 *)(r2 + 0) = r1 00000000000000c8 <LBB0_4>:
25: r6 = 0 ll 25: r1 = r6
27: r6 += r1 26: w2 = 256
00000000000000e0 <LBB0_4>: 27: r3 = 0 ll
28: r1 = r6 29: call 115
29: w2 = 256
30: r3 = 0 ll
32: call 115
32-BIT (11 insns): 32-BIT (12 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: if w0 > 256 goto +7 <LBB1_4> 18: if w0 > 256 goto +8 <LBB1_4>
19: r1 = 0 ll 19: r1 = 0 ll
21: *(u32 *)(r1 + 0) = r0 21: *(u32 *)(r1 + 0) = r0
22: w1 = w0 22: r0 <<= 32
23: r6 = 0 ll 23: r0 >>= 32
25: r6 += r1 24: r6 = 0 ll
00000000000000d0 <LBB1_4>: 26: r6 += r0
26: r1 = r6 00000000000000d8 <LBB1_4>:
27: w2 = 256 27: r1 = r6
28: r3 = 0 ll 28: w2 = 256
30: call 115 29: r3 = 0 ll
31: call 115
In ALU32 mode, the variant using 64-bit length variable clearly wins and
avoids unnecessary zero-extension bit shifts. In practice, this is even more
important and good, because BPF code won't need to do extra checks to "prove"
that payload/len are within good bounds.
32-bit len is one instruction longer. Clang decided to do 64-to-32 casting
with two bit shifts, instead of equivalent `w1 = w0` assignment. The former
uses extra register. The latter might potentially lose some range information,
but not for 32-bit value. So in this case, verifier infers that r0 is [0, 256]
after check at 18:, and shifting 32 bits left/right keeps that range intact.
We should probably look into Clang's logic and see why it chooses bitshifts
over sub-register assignments for this.
NO-ALU32 + INT NO-ALU32 + LONG
============== ===============
64-BIT (14 insns): 64-BIT (10 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: r0 <<= 32 18: if r0 > 256 goto +6 <LBB0_4>
19: r1 = r0 19: r1 = 0 ll
20: r1 >>= 32 21: *(u64 *)(r1 + 0) = r0
21: if r1 > 256 goto +7 <LBB0_4> 22: r6 = 0 ll
22: r0 s>>= 32 24: r6 += r0
23: r1 = 0 ll 00000000000000c8 <LBB0_4>:
25: *(u64 *)(r1 + 0) = r0 25: r1 = r6
26: r6 = 0 ll 26: r2 = 256
28: r6 += r0 27: r3 = 0 ll
00000000000000e8 <LBB0_4>: 29: call 115
29: r1 = r6
30: r2 = 256
31: r3 = 0 ll
33: call 115
32-BIT (13 insns): 32-BIT (13 insns):
------------------------------------ ------------------------------------
17: call 115 17: call 115
18: r1 = r0 18: r1 = r0
19: r1 <<= 32 19: r1 <<= 32
20: r1 >>= 32 20: r1 >>= 32
21: if r1 > 256 goto +6 <LBB1_4> 21: if r1 > 256 goto +6 <LBB1_4>
22: r2 = 0 ll 22: r2 = 0 ll
24: *(u32 *)(r2 + 0) = r0 24: *(u32 *)(r2 + 0) = r0
25: r6 = 0 ll 25: r6 = 0 ll
27: r6 += r1 27: r6 += r1
00000000000000e0 <LBB1_4>: 00000000000000e0 <LBB1_4>:
28: r1 = r6 28: r1 = r6
29: r2 = 256 29: r2 = 256
30: r3 = 0 ll 30: r3 = 0 ll
32: call 115 32: call 115
In NO-ALU32 mode, for the case of 64-bit len variable, Clang generates much
superior code, as expected, eliminating unnecessary bit shifts. For 32-bit
len, code is identical.
So overall, only ALU-32 32-bit len case is more-or-less equivalent and the
difference stems from internal Clang decision, rather than compiler lacking
enough information about types.
Case 2. Let's look at the simpler case of checking return result of BPF helper
for errors. The code is very simple:
long bla;
if (bpf_probe_read_kenerl(&bla, sizeof(bla), 0))
return 1;
else
return 0;
ALU32 + CHECK (9 insns) ALU32 + CHECK (9 insns)
==================================== ====================================
0: r1 = r10 0: r1 = r10
1: r1 += -8 1: r1 += -8
2: w2 = 8 2: w2 = 8
3: r3 = 0 3: r3 = 0
4: call 113 4: call 113
5: w1 = w0 5: r1 = r0
6: w0 = 1 6: w0 = 1
7: if w1 != 0 goto +1 <LBB2_2> 7: if r1 != 0 goto +1 <LBB2_2>
8: w0 = 0 8: w0 = 0
0000000000000048 <LBB2_2>: 0000000000000048 <LBB2_2>:
9: exit 9: exit
Almost identical code, the only difference is the use of full register
assignment (r1 = r0) vs half-registers (w1 = w0) in instruction #5. On 32-bit
architectures, new BPF assembly might be slightly less optimal, in theory. But
one can argue that's not a big issue, given that use of full registers is
still prevalent (e.g., for parameter passing).
NO-ALU32 + CHECK (11 insns) NO-ALU32 + CHECK (9 insns)
==================================== ====================================
0: r1 = r10 0: r1 = r10
1: r1 += -8 1: r1 += -8
2: r2 = 8 2: r2 = 8
3: r3 = 0 3: r3 = 0
4: call 113 4: call 113
5: r1 = r0 5: r1 = r0
6: r1 <<= 32 6: r0 = 1
7: r1 >>= 32 7: if r1 != 0 goto +1 <LBB2_2>
8: r0 = 1 8: r0 = 0
9: if r1 != 0 goto +1 <LBB2_2> 0000000000000048 <LBB2_2>:
10: r0 = 0 9: exit
0000000000000058 <LBB2_2>:
11: exit
NO-ALU32 is a clear improvement, getting rid of unnecessary zero-extension bit
shifts.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200623032224.4020118-1-andriin@fb.com
Before, we took a reference to the creating netns if the new netns was
different. This caused issues with circular references, with two
wireguard interfaces swapping namespaces. The solution is to rather not
take any extra references at all, but instead simply invalidate the
creating netns pointer when that netns is deleted.
In order to prevent this from happening again, this commit improves the
rough object leak tracking by allowing it to account for created and
destroyed interfaces, aside from just peers and keys. That then makes it
possible to check for the object leak when having two interfaces take a
reference to each others' namespaces.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add statements about bpftool being able to discover process info, holding
reference to BPF map, prog, link, or BTF. Show example output as well.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-10-andriin@fb.com
Adapt Makefile to support BPF skeleton generation beyond single profiler.bpf.c
case. Also add vmlinux.h generation and switch profiler.bpf.c to use it.
clang-bpf-global-var feature is extended and renamed to clang-bpf-co-re to
check for support of preserve_access_index attribute, which, together with BTF
for global variables, is the minimum requirement for modern BPF programs.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-7-andriin@fb.com
Build minimal "bootstrap mode" bpftool to enable skeleton (and, later,
vmlinux.h generation), instead of building almost complete, but slightly
different (w/o skeletons, etc) bpftool to bootstrap complete bpftool build.
Current approach doesn't scale well (engineering-wise) when adding more BPF
programs to bpftool and other complicated functionality, as it requires
constant adjusting of the code to work in both bootstrapped mode and normal
mode.
So it's better to build only minimal bpftool version that supports only BPF
skeleton code generation and BTF-to-C conversion. Thankfully, this is quite
easy to accomplish due to internal modularity of bpftool commands. This will
also allow to keep adding new functionality to bpftool in general, without the
need to care about bootstrap mode for those new parts of bpftool.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-6-andriin@fb.com
Move functions that parse map and prog by id/tag/name/etc outside of
map.c/prog.c, respectively. These functions are used outside of those files
and are generic enough to be in common. This also makes heavy-weight map.c and
prog.c more decoupled from the rest of bpftool files and facilitates more
lightweight bootstrap bpftool variant.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-5-andriin@fb.com
Validate libbpf is able to handle weak and strong kernel symbol externs in BPF
code correctly.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-4-andriin@fb.com
Add support for another (in addition to existing Kconfig) special kind of
externs in BPF code, kernel symbol externs. Such externs allow BPF code to
"know" kernel symbol address and either use it for comparisons with kernel
data structures (e.g., struct file's f_op pointer, to distinguish different
kinds of file), or, with the help of bpf_probe_user_kernel(), to follow
pointers and read data from global variables. Kernel symbol addresses are
found through /proc/kallsyms, which should be present in the system.
Currently, such kernel symbol variables are typeless: they have to be defined
as `extern const void <symbol>` and the only operation you can do (in C code)
with them is to take its address. Such extern should reside in a special
section '.ksyms'. bpf_helpers.h header provides __ksym macro for this. Strong
vs weak semantics stays the same as with Kconfig externs. If symbol is not
found in /proc/kallsyms, this will be a failure for strong (non-weak) extern,
but will be defaulted to 0 for weak externs.
If the same symbol is defined multiple times in /proc/kallsyms, then it will
be error if any of the associated addresses differs. In that case, address is
ambiguous, so libbpf falls on the side of caution, rather than confusing user
with randomly chosen address.
In the future, once kernel is extended with variables BTF information, such
ksym externs will be supported in a typed version, which will allow BPF
program to read variable's contents directly, similarly to how it's done for
fentry/fexit input arguments.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-3-andriin@fb.com
Switch existing Kconfig externs to be just one of few possible kinds of more
generic externs. This refactoring is in preparation for ksymbol extern
support, added in the follow up patch. There are no functional changes
intended.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20200619231703.738941-2-andriin@fb.com
Add a test that checks that pedit adjusts port numbers of tcp and udp
packets.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a bunch of getter for various aspects of BPF map. Some of these attribute
(e.g., key_size, value_size, type, etc) are available right now in struct
bpf_map_def, but this patch adds getter allowing to fetch them individually.
bpf_map_def approach isn't very scalable, when ABI stability requirements are
taken into account. It's much easier to extend libbpf and add support for new
features, when each aspect of BPF map has separate getter/setter.
Getters follow the common naming convention of not explicitly having "get" in
its name: bpf_map__type() returns map type, bpf_map__key_size() returns
key_size. Setters, though, explicitly have set in their name:
bpf_map__set_type(), bpf_map__set_key_size().
This patch ensures we now have a getter and a setter for the following
map attributes:
- type;
- max_entries;
- map_flags;
- numa_node;
- key_size;
- value_size;
- ifindex.
bpf_map__resize() enforces unnecessary restriction of max_entries > 0. It is
unnecessary, because libbpf actually supports zero max_entries for some cases
(e.g., for PERF_EVENT_ARRAY map) and treats it specially during map creation
time. To allow setting max_entries=0, new bpf_map__set_max_entries() setter is
added. bpf_map__resize()'s behavior is preserved for backwards compatibility
reasons.
Map ifindex getter is added as well. There is a setter already, but no
corresponding getter. Fix this assymetry as well. bpf_map__set_ifindex()
itself is converted from void function into error-returning one, similar to
other setters. The only error returned right now is -EBUSY, if BPF map is
already loaded and has corresponding FD.
One lacking attribute with no ability to get/set or even specify it
declaratively is numa_node. This patch fixes this gap and both adds
programmatic getter/setter, as well as adds support for numa_node field in
BTF-defined map.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200621062112.3006313-1-andriin@fb.com
Systems that doesn't yet have the very latest linux/bpf.h header, enum
bpf_stats_type will be undefined, causing compilation warnings. Prevents this
by forward-declaring enum.
Fixes: 0bee106716 ("libbpf: Add support for command BPF_ENABLE_STATS")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200621031159.2279101-1-andriin@fb.com
Add selftests to test access to map pointers from bpf program for all
map types except struct_ops (that one would need additional work).
verifier test focuses mostly on scenarios that must be rejected.
prog_tests test focuses on accessing multiple fields both scalar and a
nested struct from bpf program and verifies that those fields have
expected values.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/139a6a17f8016491e39347849b951525335c6eb4.1592600985.git.rdna@fb.com
There are multiple use-cases when it's convenient to have access to bpf
map fields, both `struct bpf_map` and map type specific struct-s such as
`struct bpf_array`, `struct bpf_htab`, etc.
For example while working with sock arrays it can be necessary to
calculate the key based on map->max_entries (some_hash % max_entries).
Currently this is solved by communicating max_entries via "out-of-band"
channel, e.g. via additional map with known key to get info about target
map. That works, but is not very convenient and error-prone while
working with many maps.
In other cases necessary data is dynamic (i.e. unknown at loading time)
and it's impossible to get it at all. For example while working with a
hash table it can be convenient to know how much capacity is already
used (bpf_htab.count.counter for BPF_F_NO_PREALLOC case).
At the same time kernel knows this info and can provide it to bpf
program.
Fill this gap by adding support to access bpf map fields from bpf
program for both `struct bpf_map` and map type specific fields.
Support is implemented via btf_struct_access() so that a user can define
their own `struct bpf_map` or map type specific struct in their program
with only necessary fields and preserve_access_index attribute, cast a
map to this struct and use a field.
For example:
struct bpf_map {
__u32 max_entries;
} __attribute__((preserve_access_index));
struct bpf_array {
struct bpf_map map;
__u32 elem_size;
} __attribute__((preserve_access_index));
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(max_entries, 4);
__type(key, __u32);
__type(value, __u32);
} m_array SEC(".maps");
SEC("cgroup_skb/egress")
int cg_skb(void *ctx)
{
struct bpf_array *array = (struct bpf_array *)&m_array;
struct bpf_map *map = (struct bpf_map *)&m_array;
/* .. use map->max_entries or array->map.max_entries .. */
}
Similarly to other btf_struct_access() use-cases (e.g. struct tcp_sock
in net/ipv4/bpf_tcp_ca.c) the patch allows access to any fields of
corresponding struct. Only reading from map fields is supported.
For btf_struct_access() to work there should be a way to know btf id of
a struct that corresponds to a map type. To get btf id there should be a
way to get a stringified name of map-specific struct, such as
"bpf_array", "bpf_htab", etc for a map type. Two new fields are added to
`struct bpf_map_ops` to handle it:
* .map_btf_name keeps a btf name of a struct returned by map_alloc();
* .map_btf_id is used to cache btf id of that struct.
To make btf ids calculation cheaper they're calculated once while
preparing btf_vmlinux and cached same way as it's done for btf_id field
of `struct bpf_func_proto`
While calculating btf ids, struct names are NOT checked for collision.
Collisions will be checked as a part of the work to prepare btf ids used
in verifier in compile time that should land soon. The only known
collision for `struct bpf_htab` (kernel/bpf/hashtab.c vs
net/core/sock_map.c) was fixed earlier.
Both new fields .map_btf_name and .map_btf_id must be set for a map type
for the feature to work. If neither is set for a map type, verifier will
return ENOTSUPP on a try to access map_ptr of corresponding type. If
just one of them set, it's verifier misconfiguration.
Only `struct bpf_array` for BPF_MAP_TYPE_ARRAY and `struct bpf_htab` for
BPF_MAP_TYPE_HASH are supported by this patch. Other map types will be
supported separately.
The feature is available only for CONFIG_DEBUG_INFO_BTF=y and gated by
perfmon_capable() so that unpriv programs won't have access to bpf map
fields.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/6479686a0cd1e9067993df57b4c3eef0e276fec9.1592600985.git.rdna@fb.com
Quite a lot of fixes here for no single reason. There's a collection of
the usual sort of device specific fixes and also a bunch of people have
been working on spidev and the userspace test program spidev_test so
they've got an unusually large collection of small fixes.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl7wl/8THGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0NJBB/4oTOxPIfkKQyPcSTQB8FC6AHJ5Z8FT
/4dtK4U32rn5Sms/CHNXMMFxomBgB46Ud19KJ/bEgqugJfI9Vl0bl9fZT+sDyVU3
rFPthVluIprFe1bITyfcgYmjRoznLRdzj9LMFsAWIe3Vmy584TdvXbl8n5COu5Wh
6sR6SuQnEn65x1HC4++3IsG70+ogz81sFRapk/7jXwS+YdwzWIqtN2s2IlUcTmcy
ylijl04XUSyo6e/v/sjZaaRmIz72mW+HbMKW4iC0mGZ6yuACoy7zop9BuwAg6S7z
dFDyqTZ5gq72ZeJc7fuco1y7jqQoKUC1oTRghPlbq2XG3Huta80l415l
=32lr
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"Quite a lot of fixes here for no single reason.
There's a collection of the usual sort of device specific fixes and
also a bunch of people have been working on spidev and the userspace
test program spidev_test so they've got an unusually large collection
of small fixes"
* tag 'spi-fix-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spidev: fix a potential use-after-free in spidev_release()
spi: spidev: fix a race between spidev_release and spidev_remove
spi: stm32-qspi: Fix error path in case of -EPROBE_DEFER
spi: uapi: spidev: Use TABs for alignment
spi: spi-fsl-dspi: Free DMA memory with matching function
spi: tools: Add macro definitions to fix build errors
spi: tools: Make default_tx/rx and input_tx static
spi: dt-bindings: amlogic, meson-gx-spicc: Fix schema for meson-g12a
spi: rspi: Use requested instead of maximum bit rate
spi: spidev_test: Use %u to format unsigned numbers
spi: sprd: switch the sequence of setting WDG_LOAD_LOW and _HIGH
On some platforms the default encoding is not utf-8, which causes an
UnicodeDecodeError when reading the flamegraph template and writing the
flamegraph
Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200619153232.203537-1-agerstmayr@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since iproute2 commit f72c3ad00f3b ("tc: m_tunnel_key: add options
support for vxlan"), the geneve opt output use key word "geneve_opts"
instead of "geneve_opt". To make compatibility for both old and new
iproute2, let's accept both "geneve_opt" and "geneve_opts".
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new strict mode functionality is tested in different configurations and
on different network namespaces.
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Have recordmcount work with > 64K sections (to support LTO)
- kprobe RCU fixes
- Correct a kprobe critical section with missing mutex
- Remove redundant arch_disarm_kprobe() call
- Fix lockup when kretprobe triggers within kprobe_flush_task()
- Fix memory leak in fetch_op_data operations
- Fix sleep in atomic in ftrace trace array sample code
- Free up memory on failure in sample trace array code
- Fix incorrect reporting of function_graph fields in format file
- Fix quote within quote parsing in bootconfig
- Fix return value of bootconfig tool
- Add testcases for bootconfig tool
- Fix maybe uninitialized warning in ftrace pid file code
- Remove unused variable in tracing_iter_reset()
- Fix some typos
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXu1jrRQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qoCMAP91nOccE3X+Nvc3zET3isDWnl1tWJxk
icsBgN/JwBRuTAD/dnWTHIWM2/5lTiagvyVsmINdJHP6JLr8T7dpN9tlxAQ=
=Cuo7
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Have recordmcount work with > 64K sections (to support LTO)
- kprobe RCU fixes
- Correct a kprobe critical section with missing mutex
- Remove redundant arch_disarm_kprobe() call
- Fix lockup when kretprobe triggers within kprobe_flush_task()
- Fix memory leak in fetch_op_data operations
- Fix sleep in atomic in ftrace trace array sample code
- Free up memory on failure in sample trace array code
- Fix incorrect reporting of function_graph fields in format file
- Fix quote within quote parsing in bootconfig
- Fix return value of bootconfig tool
- Add testcases for bootconfig tool
- Fix maybe uninitialized warning in ftrace pid file code
- Remove unused variable in tracing_iter_reset()
- Fix some typos
* tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix maybe-uninitialized compiler warning
tools/bootconfig: Add testcase for show-command and quotes test
tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig
tools/bootconfig: Fix to use correct quotes for value
proc/bootconfig: Fix to use correct quotes for value
tracing: Remove unused event variable in tracing_iter_reset
tracing/probe: Fix memleak in fetch_op_data operations
trace: Fix typo in allocate_ftrace_ops()'s comment
tracing: Make ftrace packed events have align of 1
sample-trace-array: Remove trace_array 'sample-instance'
sample-trace-array: Fix sleeping function called from invalid context
kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
kprobes: Remove redundant arch_disarm_kprobe() call
kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex
kprobes: Use non RCU traversal APIs on kprobe_tables if possible
kprobes: Suppress the suspicious RCU warning on kprobes
recordmcount: support >64k sections
- Few ptrace fixes mostly for strace and seccomp_bpf kernel tests
findings.
- Cleanup unused pm callbacks in virtio ccw.
- Replace kmalloc + memset with kzalloc in crypto.
- Use $(LD) for vDSO linkage to make clang happy.
- Fix vDSO clock_getres() to preserve the same behaviour as
posix_get_hrtimer_res().
- Fix workqueue cpumask warning when NUMA=n and nr_node_ids=2.
- Reduce SLSB writes during input processing, improve warnings and
cleanup qdio_data usage in qdio.
- Few fixes to use scnprintf() instead of snprintf().
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl7uJy8ACgkQjYWKoQLX
FBitMwgAovHP6O19ZS2RE2Ps20CjM+z0sLLGHF6aMrV7OqmOWrNnFzN4jT2j42Ck
idSZ6sehVd3Uj6K8NnzrlSS3sjGRhVaQJEjjN+rLyw0HBwxspJJfW5HgcoMtqNH1
oo+nt+zw5jk+6MqHx4QEwTxN5rgGs6UMhiLIAIlkDu4bivgohvGUxe4RUrN/mINx
cdYqomCkvovLT5sBTaWyXKNCDAdAWgNpOfdqc9MjOUXSbUg3lrUol0gUULzenPo7
wUN+sZ0di0Ox0+2+4m8LU1av/kMTLSSvnR9DW5KdpGTon1nwpZcdJnhI5o1v7uaU
pIaMOYNieEHJ2DnieR9iBBSbGoNCmw==
=gkgN
-----END PGP SIGNATURE-----
Merge tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- a few ptrace fixes mostly for strace and seccomp_bpf kernel tests
findings
- cleanup unused pm callbacks in virtio ccw
- replace kmalloc + memset with kzalloc in crypto
- use $(LD) for vDSO linkage to make clang happy
- fix vDSO clock_getres() to preserve the same behaviour as
posix_get_hrtimer_res()
- fix workqueue cpumask warning when NUMA=n and nr_node_ids=2
- reduce SLSB writes during input processing, improve warnings and
cleanup qdio_data usage in qdio
- a few fixes to use scnprintf() instead of snprintf()
* tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix syscall_get_error for compat processes
s390/qdio: warn about unexpected SLSB states
s390/qdio: clean up usage of qdio_data
s390/numa: let NODES_SHIFT depend on NEED_MULTIPLE_NODES
s390/vdso: fix vDSO clock_getres()
s390/vdso: Use $(LD) instead of $(CC) to link vDSO
s390/protvirt: use scnprintf() instead of snprintf()
s390: use scnprintf() in sys_##_prefix##_##_name##_show
s390/crypto: use scnprintf() instead of snprintf()
s390/zcrypt: use kzalloc
s390/virtio: remove unused pm callbacks
s390/qdio: reduce SLSB writes during Input Queue processing
selftests/seccomp: s390 shares the syscall and return value register
s390/ptrace: fix setting syscall number
s390/ptrace: pass invalid syscall numbers to tracing
s390/ptrace: return -ENOSYS when invalid syscall is supplied
s390/seccomp: pass syscall arguments via seccomp_data
s390/qdio: fine-tune SLSB update
This Kselftest update for Linux 5.8-rc2 consists of:
- ftrace "requires:" list for simplifying and unifying requirement
checks for each test case, adding "requires:" line instead of
checking required ftrace interfaces in each test case.
- a minor spelling correction patch
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl7s440ACgkQCwJExA0N
Qxyl1A//YDbxzJlB6zrmSmagIKC0KGKDcSrAJNsV6K5lIEf7yPi2DkvQT6jpo27M
Xtv+GTF8nuAJRuRceawKp622o+jT7dd7dUt0FFInsj2xPwTLFy0E+AvZ5j2cqps7
gqcdOeHU/u6QrprUX4JCoOf0NdjJsKwzfOXiaOlBGG26bqMZFOnhbd/DclwWbStJ
gOIqMxp7feQ9yjm5ngSZVmQ6sbM+6ZwsywD1kXrSL68gQ+s3segnzDoaoHSFoGwj
lysOHbx3+/Q+pH18nnFAp6H9csr+iMVqlj1FGN62JUVhkaMdlFhnk/cpO97Lvuwf
nOU3BC0FWSjvquAhL2ZqGnfeXBoHSO9wfpOcC/Dyro4QozvlzigM1PBS3WoRbxzl
2ArZ985xhSqeuiJKa4CwLQWR495FOasm4gHBe9nVrqaiwAORZfdqeBgTJBTw8Y7K
W3+DmA4ttZE6QNZokfgpGLWT4K9PRm8F5ZdzUX8+sqjAQmlMp30LW7lhhAlbprqc
dKUAIB1dJwwT2gPsa/ntIf1SZEbMD8FlrAZDDBNZkId/IdgtRUrl9OCPGAwyrRJ7
pA4YxqBQqB1hwL51Dmk1KOlhePIfeDu9DFzUPgKETvuZA6ITmyM/artYaexcpDoF
7pvbLNsNsL7yuNF8y7wCSFj4IP8kPYRCBpmXR4hhugJI15M14dw=
=nqUq
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest cleanups from Shuah Khan:
- ftrace "requires:" list for simplifying and unifying requirement
checks for each test case, adding "requires:" line instead of
checking required ftrace interfaces in each test case.
- a minor spelling correction patch
* tag 'linux-kselftest-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/ftrace: Support ":README" suffix for requires
selftests/ftrace: Support ":tracer" suffix for requires
selftests/ftrace: Convert check_filter_file() with requires list
selftests/ftrace: Convert required interface checks into requires list
selftests/ftrace: Add "requires:" list support
selftests/ftrace: Return unsupported for the unconfigured features
selftests/ftrace: Allow ":" in description
tools: testing: ftrace: trigger: fix spelling mistake
The ETF qdisc can queue skbs that it could not pace on the errqueue.
Address a few issues in the selftest
- recv buffer size was too small, and incorrectly calculated
- compared errno to ee_code instead of ee_errno
- missed invalid request error type
v2:
- fix a few checkpatch --strict indentation warnings
Fixes: ea6a547669 ("selftests/net: make so_txtime more robust to timer variance")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Relicense it to be compatible with the rest of bpftool files.
Suggested-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200619222024.519774-1-andriin@fb.com
Added two test_verifier subtests for 32bit pointer/scalar arithmetic
with BPF_SUB operator. They are passing verifier now.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200618234632.3321367-1-yhs@fb.com
Since many compilers cannot disable KCOV with a function attribute,
help it to NOP out any __sanitizer_cov_*() calls injected in noinstr
code.
This turns:
12: e8 00 00 00 00 callq 17 <lockdep_hardirqs_on+0x17>
13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4
into:
12: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4
Just like recordmcount does.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
This provides infrastructure to rewrite instructions; this is
immediately useful for helping out with KCOV-vs-noinstr, but will
also come in handy for a bunch of variable sized jump-label patches
that are still on ice.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
With there being multiple ways to change the ELF data, let's more
concisely track modification.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
When build perf with ASan or UBSan, if libasan or libubsan can not find,
the feature-glibc is 0 and there exists the following error log which is
wrong, because we can find gnu/libc-version.h in /usr/include,
glibc-devel is also installed.
[yangtiezhu@linux perf]$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
BUILD: Doing 'make -j4' parallel build
HOSTCC fixdep.o
HOSTLD fixdep-in.o
LINK fixdep
<stdin>:1:0: warning: -fsanitize=address and -fsanitize=kernel-address are not supported for this target
<stdin>:1:0: warning: -fsanitize=address not supported for this target
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:393: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
Makefile.perf:224: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:69: recipe for target 'all' failed
make: *** [all] Error 2
[yangtiezhu@linux perf]$ ls /usr/include/gnu/libc-version.h
/usr/include/gnu/libc-version.h
After install libasan and libubsan, the feature-glibc is 1 and the build
process is success, so the cause is related with libasan or libubsan, we
should check them and print an error log to reflect the reality.
Committer testing:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:401: *** No libasan found, please install libasan. Stop.
make[1]: *** [Makefile.perf:231: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
$
$ sudo dnf install libasan
<SNIP>
Installed:
libasan-9.3.1-2.fc31.x86_64
$
$
$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
<SNIP>
CC /tmp/build/perf/util/pmu-flex.o
FLEX /tmp/build/perf/util/expr-flex.c
CC /tmp/build/perf/util/expr-bison.o
CC /tmp/build/perf/util/expr.o
CC /tmp/build/perf/util/expr-flex.o
CC /tmp/build/perf/util/parse-events-flex.o
CC /tmp/build/perf/util/parse-events.o
LD /tmp/build/perf/util/intel-pt-decoder/perf-in.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
INSTALL python-scripts
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
libasan.so.5 => /lib64/libasan.so.5 (0x00007f0904164000)
$
And if we rebuild without -fsanitize-address:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
$ make O=/tmp/build/perf -C tools/perf/ install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libbfd: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
GEN /tmp/build/perf/common-cmds.h
CC /tmp/build/perf/exec-cmd.o
<SNIP>
INSTALL perf_completion-script
INSTALL perf-tip
make: Leaving directory '/home/acme/git/perf/tools/perf'
$ ldd ~/bin/perf | grep asan
$
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: tiezhu yang <yangtiezhu@loongson.cn>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1592445961-28044-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In order to move pointer checks like IS_ERR_VALUE() out of the hotpath
and into the reader path of a trace event, user space tools need to be
able to parse that. IS_ERR_VALUE() is defined as:
#define IS_ERR_VALUE() unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
Which eventually turns into:
__builtin_expect(!!((unsigned long)(void *)(x) >= (unsigned long)-4095), 0)
Now the traceevent parser can handle most of that except for the
__builtin_expect(), which needs to be added.
Link: https://lore.kernel.org/linux-mm/20200320055823.27089-3-jaewon31.kim@samsung.com/
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jaewon Kim <jaewon31.kim@samsung.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200324200956.821799393@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit c61f13eaa1 ("gcc-plugins: Add structleak for more stack
initialization") added "__attribute__((user))" to the user when
stackleak detector is enabled. This now appears in the field format of
system call trace events for system calls that have user buffers. The
"__attribute__((user))" breaks the parsing in libtraceevent. That needs
to be handled.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jaewon Kim <jaewon31.kim@samsung.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org
Cc: linux-trace-devel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200324200956.663647256@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's several locations that open code realloc and strcat() to append
text to strings. Add an append() function that takes a delimiter and a
string to append to another string.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jaewon Lim <jaewon31.kim@samsung.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: linux-mm@kvack.org
Cc: linux-trace-devel@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lore.kernel.org/lkml/20200324200956.515118403@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Alexei Starovoitov says:
====================
pull-request: bpf 2020-06-17
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 158 insertions(+), 59 deletions(-).
The main changes are:
1) Important fix for bpf_probe_read_kernel_str() return value, from Andrii.
2) [gs]etsockopt fix for large optlen, from Stanislav.
3) devmap allocation fix, from Toke.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We are relying on the fact, that we can pass > sizeof(int) optvals
to the SOL_IP+IP_FREEBIND option (the kernel will take first 4 bytes).
In the BPF program we check that we can only touch PAGE_SIZE bytes,
but the real optlen is PAGE_SIZE * 2. In both cases, we override it to
some predefined value and trim the optlen.
Also, let's modify exiting IP_TOS usecase to test optlen=0 case
where BPF program just bypasses the data as is.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200617010416.93086-2-sdf@google.com
To pick the changes from:
b383a73f2b ("fs/ext4: Introduce DAX inode flag")
And silence this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
It causes various beautifiers for things like fspick, fsmount, etc (see
below) to get rebuilt, but this specific change doesn't make 'perf
trace' be capable of decoding anything new, as we still don't decode
what comes from ioctls, just its cmds.
Details about the update:
$ cp include/uapi/linux/fs.h tools/include/uapi/linux/fs.h
$ git diff
diff --git a/tools/include/uapi/linux/fs.h b/tools/include/uapi/linux/fs.h
index 379a612f8f1d..f44eb0a04afd 100644
--- a/tools/include/uapi/linux/fs.h
+++ b/tools/include/uapi/linux/fs.h
@@ -262,6 +262,7 @@ struct fsxattr {
#define FS_EA_INODE_FL 0x00200000 /* Inode used for large EA */
#define FS_EOFBLOCKS_FL 0x00400000 /* Reserved for ext4 */
#define FS_NOCOW_FL 0x00800000 /* Do not cow file */
+#define FS_DAX_FL 0x02000000 /* Inode is DAX */
#define FS_INLINE_DATA_FL 0x10000000 /* Reserved for ext4 */
#define FS_PROJINHERIT_FL 0x20000000 /* Create with parents projid */
#define FS_CASEFOLD_FL 0x40000000 /* Folder is case insensitive */
$ m
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j8' parallel build
INSTALL GTK UI
CC /tmp/build/perf/builtin-trace.o
DESCEND plugins
CC /tmp/build/perf/trace/beauty/fsmount.o
CC /tmp/build/perf/trace/beauty/fspick.o
CC /tmp/build/perf/trace/beauty/mount_flags.o
CC /tmp/build/perf/trace/beauty/move_mount.o
CC /tmp/build/perf/trace/beauty/renameat.o
CC /tmp/build/perf/trace/beauty/sync_file_range.o
INSTALL trace_plugins
LD /tmp/build/perf/trace/beauty/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
7e5b3c267d ("x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation")
Addressing these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
With this one will be able to use these new AMD MSRs in filters, by
name, e.g.:
# perf trace -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
^C#
Using -v we can see how it sets up the tracepoint filters, converting
from the string in the filter to the numeric value:
# perf trace -v -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
Using CPUID GenuineIntel-6-8E-A
0x123
New filter for msr:read_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
0x123
New filter for msr:write_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
0x123
New filter for msr:rdpmc: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
mmap size 528384B
^C#
The updating process shows how this affects tooling in more detail:
$ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
--- tools/arch/x86/include/asm/msr-index.h 2020-06-03 10:36:09.959910238 -0300
+++ arch/x86/include/asm/msr-index.h 2020-06-17 10:04:20.235052901 -0300
@@ -128,6 +128,10 @@
#define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */
#define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */
+/* SRBDS support */
+#define MSR_IA32_MCU_OPT_CTRL 0x00000123
+#define RNGDS_MITG_DIS BIT(0)
+
#define MSR_IA32_SYSENTER_CS 0x00000174
#define MSR_IA32_SYSENTER_ESP 0x00000175
#define MSR_IA32_SYSENTER_EIP 0x00000176
$ set -o vi
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
$ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
$ diff -u before after
--- before 2020-06-17 10:05:49.653114752 -0300
+++ after 2020-06-17 10:06:01.777258731 -0300
@@ -51,6 +51,7 @@
[0x0000011e] = "IA32_BBL_CR_CTL3",
[0x00000120] = "IDT_MCR_CTRL",
[0x00000122] = "IA32_TSX_CTRL",
+ [0x00000123] = "IA32_MCU_OPT_CTRL",
[0x00000140] = "MISC_FEATURES_ENABLES",
[0x00000174] = "IA32_SYSENTER_CS",
[0x00000175] = "IA32_SYSENTER_ESP",
$
The related change to cpu-features.h affects this:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
This shouldn't be affecting that 'perf bench' entry:
$ find tools/perf/ -type f | xargs grep SRBDS
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Gross <mgross@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get some newer headers that got out of sync with the copies in tools/
so that we can try to have the tools/perf/ build clean for v5.8 with
fewer pull requests.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes segmentation fault when trying to interpret zstd-compressed data
with perf script:
```
$ perf record -z ls
...
[ perf record: Captured and wrote 0,010 MB perf.data, compressed (original 0,001 MB, ratio is 2,190) ]
$ memcheck perf script
...
==67911== Invalid read of size 4
==67911== at 0x5568188: ZSTD_decompressStream (in /usr/lib/libzstd.so.1.4.5)
==67911== by 0x6E726B: zstd_decompress_stream (zstd.c:100)
==67911== by 0x65729C: perf_session__process_compressed_event (session.c:72)
==67911== by 0x6598E8: perf_session__process_user_event (session.c:1583)
==67911== by 0x65BA59: reader__process_events (session.c:2177)
==67911== by 0x65BA59: __perf_session__process_events (session.c:2234)
==67911== by 0x65BA59: perf_session__process_events (session.c:2267)
==67911== by 0x5A7397: __cmd_script (builtin-script.c:2447)
==67911== by 0x5A7397: cmd_script (builtin-script.c:3840)
==67911== by 0x5FE9D2: run_builtin (perf.c:312)
==67911== by 0x711627: handle_internal_command (perf.c:364)
==67911== by 0x711627: run_argv (perf.c:408)
==67911== by 0x711627: main (perf.c:538)
==67911== Address 0x71d8 is not stack'd, malloc'd or (recently) free'd
```
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LPU-Reference: 20200612230333.72140-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit c34a06c56d ("tools/bpftool: Add ringbuf map to a list of known
map types") added the symbolic "ringbuf" name. Document it in the bpftool
map command docs and usage as well.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200616113303.8123-1-tklauser@distanz.ch
Add testcases for the return value of the command to show
bootconfig in initrd, and double/single quotes selecting.
Link: http://lkml.kernel.org/r/159230247428.65555.2109472942519215104.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix bootconfig to return 0 if succeeded to show the bootconfig
in initrd. Without this fix, "bootconfig INITRD" command
returns !0 even if the command succeeded to show the bootconfig.
Link: http://lkml.kernel.org/r/159230246566.65555.11891772258543514487.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes: 950313ebf7 ("tools: bootconfig: Add bootconfig command")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix bootconfig tool to select double or single quotes
correctly according to the value.
If a bootconfig value includes a double quote character,
we must use single-quotes to quote that value.
Link: http://lkml.kernel.org/r/159230245697.65555.12444299015852932304.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes: 950313ebf7 ("tools: bootconfig: Add bootconfig command")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Verify that setns() reports EINVAL when an fd is passed that refers to an
open file but the file is not a file descriptor useable to interact with
namespaces.
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Cyril Hrubis <chrubis@suse.cz>
Link: https://lore.kernel.org/lkml/20200615085836.GR12456@shao2-debian
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Add ":README" suffix support for the requires list, so that
the testcase can list up the required string for README file
to the requires list.
Note that the required string is treated as a fixed string,
instead of regular expression. Also, the testcase can specify
a string containing spaces with quotes. E.g.
# requires: "place: [<module>:]<symbol>":README
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add ":tracer" suffix support for the requires list, so that
the testcase can list up the required tracer (e.g. function)
to the requires list.
For example, if the testcase requires function_graph tracer,
it can write requires list as below instead of checking
available_tracers.
# requires: function_graph:tracer
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Since check_filter_file() is basically checking the filter
tracefs file, we can convert it into requires list.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Introduce "requires:" list to check required ftrace interface
for each test. This will simplify the interface checking code
and unify the error message. Another good point is, it can
skip the ftrace initializing.
Note that this requires list must be written as a shell
comment.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
As same as other test cases, return unsupported if kprobe_events
or argument access feature are not found.
There can be a new arch which does not port those features yet,
and an older kernel which doesn't support it.
Those can not enable the features.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Allow ":" in the description line. Currently if there is ":"
in the test description line, the description is cut at that
point, but that was unintended.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
s390 cannot set syscall number and reture code at the same time,
so set the appropriate flag to indicate it.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>