Commit Graph

16469 Commits

Author SHA1 Message Date
David S. Miller
4e7df119d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next:

1) Add .release_ops to properly unroll .select_ops, use it from nft_compat.
   After this change, we can remove list of extensions too to simplify this
   codebase.

2) Update amanda conntrack helper to support v3.4, from Florian Tham.

3) Get rid of the obsolete BUGPRINT macro in ebtables, from
   Florian Westphal.

4) Merge IPv4 and IPv6 masquerading infrastructure into one single module.
   From Florian Westphal.

5) Patchset to remove nf_nat_l3proto structure to get rid of
   indirections, from Florian Westphal.

6) Skip unnecessary conntrack timeout updates in case the value is
   still the same, also from Florian Westphal.

7) Remove unnecessary 'fall through' comments in empty switch cases,
   from Li RongQing.

8) Fix lookup to fixed size hashtable sets on big endian with 32-bit keys.

9) Incorrect logic to deactivate path of fixed size hashtable sets,
   element was being tested to self.

10) Remove nft_hash_key(), the bitmap set is always selected for 16-bit
    keys.

11) Use boolean whenever possible in IPVS codebase, from Andrea Claudi.

12) Enter close state in conntrack if RST matches exact sequence number,
    from Florian Westphal.

13) Initialize dst_cache in tunnel extension, from wenxu.

14) Pass protocol as u16 to xt_check_match and xt_check_target, from
    Li RongQing.

15) SCTP header is granted to be in a linear area from IPVS NAT handler,
    from Xin Long.

16) Don't steal packets coming from slave VRF device from the
    ip_sabotage_in() path, from David Ahern.

17) Fix unsafe update of basechain stats, from Li RongQing.

18) Make sure CONNTRACK_LOCKS is power of 2 to let compiler optimize
    modulo operation as bitwise AND, from Li RongQing.

19) Use device_attribute instead of internal definition in the IDLETIMER
    target, from Sami Tolvanen.

20) Merge redir, masq and IPv4/IPv6 NAT chain types, from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 14:01:04 -08:00
David S. Miller
9eb359140c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-02 12:54:35 -08:00
Lucas Bates
255c1c7279 tc-testing: Allow test cases to be skipped
By adding a check for an optional key/value pair to the test case
data, individual test cases may be skipped to prevent tdc from
aborting a test run due to setup or teardown failure.

If a test case is skipped, it will still appear in the results
output to allow for a consistent number of executed tests in each
run. However, the test will be marked as skipped.

This support for skipping extends to any plugins that may generate
additional results for each executed test.

Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 23:05:06 -08:00
Paolo Abeni
ada641ff6e selftests: fixes for UDP GRO
The current implementation for UDP GRO tests is racy: the receiver
may flush the RX queue while the sending is still transmitting and
incorrectly report RX errors, with a wrong number of packet received.

Add explicit timeouts to the receiver for both connection activation
(first packet received for UDP) and reception completion, so that
in the above critical scenario the receiver will wait for the
transfer completion.

Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01 11:24:00 -08:00
David Ahern
be9cefe796 selftests: rtnetlink: use internal netns switch for ip commands
'ip' can switch network namespaces internally and then run a given
command relative to that namespace without the need to fork and exec
another ip instance. Update all references of the form:
    ip netns exec "$testns" ip ...
to
    ip -netns "$testns" ...

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-28 13:02:43 -08:00
Paolo Abeni
b3cc4f8a8a selftests: pmtu: add explicit tests for PMTU exceptions cleanup
Add a couple of new tests, explicitly checking that the kernel
timely releases PMTU exceptions on related device removal.
This is mostly a regression test vs the issue fixed by
commit f5b51fe804 ("ipv6: route: purge exception on removal")

Only 2 new test cases have been added, instead of extending all
the existing ones, because the reproducer requires executing
several commands and would slow down too much the tests otherwise.

v2 -> v3:
 - more cleanup, still from Stefano

v1 -> v2:
 - several script cleanups, as suggested by Stefano

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 21:28:59 -08:00
Paolo Abeni
651eb32e56 selftests: pmtu: disable DAD in all namespaces
Otherwise, the configured IPv6 address could be still "tentative"
at test time, possibly causing tests failures.
We can also drop some sleep along the code and decrease the
timeout for most commands so that the test runtime decreases.

v1 -> v2:
 - fix comment (Stefano)

Fixes: d1f1b9cbf3 ("selftests: net: Introduce first PMTU test")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-27 21:28:59 -08:00
Florian Westphal
3bf195ae60 netfilter: nat: merge nf_nat_ipv4,6 into nat core
before:
   text    data     bss     dec     hex filename
  16566    1576    4136   22278    5706 nf_nat.ko
   3598	    844	      0	   4442	   115a	nf_nat_ipv6.ko
   3187	    844	      0	   4031	    fbf	nf_nat_ipv4.ko

after:
   text    data     bss     dec     hex filename
  22948    1612    4136   28696    7018 nf_nat.ko

... with ipv4/v6 nat now provided directly via nf_nat.ko.

Also changes:
       ret = nf_nat_ipv4_fn(priv, skb, state);
       if (ret != NF_DROP && ret != NF_STOLEN &&
into
	if (ret != NF_ACCEPT)
		return ret;

everywhere.

The nat hooks never should return anything other than
ACCEPT or DROP (and the latter only in rare error cases).

The original code uses multi-line ANDing including assignment-in-if:
        if (ret != NF_DROP && ret != NF_STOLEN &&
           !(IPCB(skb)->flags & IPSKB_XFRM_TRANSFORMED) &&
            (ct = nf_ct_get(skb, &ctinfo)) != NULL) {

I removed this while moving, breaking those in separate conditionals
and moving the assignments into extra lines.

checkpatch still generates some warnings:
 1. Overly long lines (of moved code).
    Breaking them is even more ugly. so I kept this as-is.
 2. use of extern function declarations in a .c file.
    This is necessary evil, we must call
    nf_nat_l3proto_register() from the nat core now.
    All l3proto related functions are removed later in this series,
    those prototypes are then removed as well.

v2: keep empty nf_nat_ipv6_csum_update stub for CONFIG_IPV6=n case.
v3: remove IS_ENABLED(NF_NAT_IPV4/6) tests, NF_NAT_IPVx toggles
    are removed here.
v4: also get rid of the assignments in conditionals.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27 10:49:55 +01:00
Vlad Buslov
a110ae7096 tc-testing: gitignore, ignore local tdc config file
Comment in tdc_config.py recommends putting customizations in
tdc_config_local.py file that wasn't included in gitignore. Add the local
config file to gitignore.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 09:20:42 -08:00
Roopa Prabhu
da640bc051 tools: selftests: rtnetlink: add testcases for vxlan flag sets
This patch extends rtnetlink.sh to cover some vxlan flag
netlink attribute sets.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 08:54:37 -08:00
Jiri Pirko
81d56d8292 selftests: mlxsw: spectrum-2: Add massive delta rehash test
Do insertions and removal of filters during rehash in higher volumes.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
f6eaf1c3ac selftests: mlxsw: spectrum-2: Check migrate end trace
Add checking of newly added trace.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
d39ca90f59 selftests: mlxsw: spectrum-2: Add IPv6 variant of simple delta rehash test
Track the basic codepaths of delta rehash handling,
using mlxsw tracepoints. Use IPv6 addresses.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Vlad Buslov
5ce4645171 selftests: concurrency: add test to verify parallel replace/delete
Implement test that runs 5 instances of tc replace filter in parallel with
5 instances of tc del filter from same tp instance. Each instance uses its
own filter handle and key range.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:59 -08:00
Vlad Buslov
be6b294dbd selftests: concurrency: add test to verify parallel add/delete
Implement test that runs 5 instances of tc add filter in parallel with 5
instances of tc del filter from same tp instance. Each instance uses its
own filter handle and key range.

Extend tdc_multibatch.py with additional options required to implement the
test: common prefix for all generated batch files, first value of filter
handle range, MAC address prefix modifier. These are necessary to allow
creating batch files with unique keys and handle ranges with multiple
invocation of tdc_multibatch.py helper script.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:59 -08:00
Vlad Buslov
a788b302c5 selftests: concurrency: add test to verify concurrent delete
Implement test that verifies concurrent deletion of rules by executing 10
tc instances that delete flower filters in same handle range. In this case
only one tc instance succeeds in deleting a filter with particular handle.
To mitigate expected failures of all other instances, run tc with 'force'
option to continue processing batch file in case of errors and expect xargs
to return code '123' that indicates that invocation of command(s) exited
with error in range 1-125.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:59 -08:00
Vlad Buslov
424c5bd46a selftests: concurrency: add test to verify concurrent replace
Implement test that verifies concurrent replacement of rules by executing
10 tc instances that replace flower filters in same handle range.

Extend tdc_multibatch.py script with new optional CLI argument that is used
to generate all batch files with same filter handle range.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:59 -08:00
Vlad Buslov
4ba21de23a selftests: concurrency: add test to verify parallel rules replace
Implement test that verifies parallel rules replacement by adding 1 million
flower filters and then replacing them with 10 concurrent tc instances.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:59 -08:00
Vlad Buslov
596952fc4f selftests: concurrency: add test to verify parallel rules deletion
Implement test that verifies parallel rules deletion by adding 1 million
flower filters and then deleting them with 10 concurrent tc instances.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:58 -08:00
Vlad Buslov
450ef62033 selftests: concurrency: add test to verify parallel rules insertion
Implement test that verifies parallel rules insertion by adding 1 million
flower filters with 10 concurrent tc instances. Put it to standalone
'concurrency' category.

Implement tdc_multibatch.py helper script that is used to generate multiple
batch files for concurrent tc execution. Extend config with new 'BATCH_DIR'
variable to specify temporary output directory that is used to store batch
files generated by tdc_multibatch.py.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:58 -08:00
Vlad Buslov
3b07270db8 selftests: tdc_batch.py: add options needed for concurrency tests
Extend tdc_batch.py with several optional CLI arguments that are used for
implementation of concurrency tests in following patches in this set:

- Add optional argument to specify range of filter handles used in batch
  file [fitler_handle, filter_handle+number). This is needed for testing
  filter deletion where it is necessary to know exact handles of configured
  filters.

- Add optional argument to specify filter operation type (possible values
  are ['add', 'del', 'replace']) instead of hardcoded "add" value. This
  allows generation of batches for filter addition, deletion and
  replacement.

- Add optional argument to allow user to change mac address prefix that is
  used for all filters in batch. This is necessary to allow generating
  multiple batches with unique flower classifier keys.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:49:58 -08:00
David S. Miller
70f3522614 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Three conflicts, one of which, for marvell10g.c is non-trivial and
requires some follow-up from Heiner or someone else.

The issue is that Heiner converted the marvell10g driver over to
use the generic c45 code as much as possible.

However, in 'net' a bug fix appeared which makes sure that a new
local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0
is cleared.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 12:06:19 -08:00
Thadeu Lima de Souza Cascardo
af548a27b1 selftests: fib_tests: sleep after changing carrier. again.
Just like commit e2ba732a16 ("selftests: fib_tests: sleep after
changing carrier"), wait one second to allow linkwatch to propagate the
carrier change to the stack.

There are two sets of carrier tests. The first slept after the carrier
was set to off, and when the second set ran, it was likely that the
linkwatch would be able to run again without much delay, reducing the
likelihood of a race. However, if you run 'fib_tests.sh -t carrier' on a
loop, you will quickly notice the failures.

Sleeping on the second set of tests make the failures go away.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 18:34:20 -08:00
Alban Crequy
7c0cdf0b39 bpf, lpm: fix lookup bug in map_delete_elem
trie_delete_elem() was deleting an entry even though it was not matching
if the prefixlen was correct. This patch adds a check on matchlen.

Reproducer:

$ sudo bpftool map create /sys/fs/bpf/mylpm type lpm_trie key 8 value 1 entries 128 name mylpm flags 1
$ sudo bpftool map update pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 aa bb cc dd value hex 01
$ sudo bpftool map dump pinned /sys/fs/bpf/mylpm
key: 10 00 00 00 aa bb cc dd  value: 01
Found 1 element
$ sudo bpftool map delete pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 ff ff ff ff
$ echo $?
0
$ sudo bpftool map dump pinned /sys/fs/bpf/mylpm
Found 0 elements

A similar reproducer is added in the selftests.

Without the patch:

$ sudo ./tools/testing/selftests/bpf/test_lpm_map
test_lpm_map: test_lpm_map.c:485: test_lpm_delete: Assertion `bpf_map_delete_elem(map_fd, key) == -1 && errno == ENOENT' failed.
Aborted

With the patch: test_lpm_map runs without errors.

Fixes: e454cf5958 ("bpf: Implement map_delete_elem for BPF_MAP_TYPE_LPM_TRIE")
Cc: Craig Gallek <kraig@google.com>
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-22 16:17:53 +01:00
Vakul Garg
203ef5f1ff selftest/tls: Add test to verify received 'type' of non-data record
Test case 'control_msg' has been updated to peek non-data record and
then verify the type of record received. Subsequently, the same record
is retrieved without MSG_PEEK flag in recvmsg().

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-20 11:05:55 -08:00
David S. Miller
885e631959 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-02-16

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong.

2) test all bpf progs in alu32 mode, from Jiong.

3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin.

4) support for IP encap in lwt bpf progs, from Peter.

5) remove XDP_QUERY_XSK_UMEM dead code, from Jan.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-16 22:56:34 -08:00
Andrii Nakryiko
5aab392c55 tools/libbpf: support bigger BTF data sizes
While it's understandable why kernel limits number of BTF types to 65535
and size of string section to 64KB, in libbpf as user-space library it's
too restrictive. E.g., pahole converting DWARF to BTF type information
for Linux kernel generates more than 3 million BTF types and more than
3MB of strings, before deduplication. So to allow btf__dedup() to do its
work, we need to be able to load bigger BTF sections using btf__new().

Singed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-16 18:47:18 -08:00
Peter Oskolkov
9d6b3584a7 selftests: bpf: test_lwt_ip_encap: add negative tests.
As requested by David Ahern:

- add negative tests (no routes, explicitly unreachable destinations)
  to exercize error handling code paths;
- do not exit on test failures, but instead print a summary of
  passed/failed tests at the end.

Future patches will add TSO and VRF tests.

Signed-off-by: Peter Oskolkov <posk@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-16 18:41:44 -08:00
Florian Fainelli
ff326d3cdf selftests: forwarding: Add some missing configuration symbols
For the forwarding selftests to work, we need network namespaces when
using veth/vrf otherwise ping/ping6 commands like these:

ip vrf exec vveth0 /bin/ping 192.0.2.2 -c 10 -i 0.1 -w 5

will fail because network namespaces may not be enabled.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15 20:32:22 -08:00
David S. Miller
3313da8188 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The netfilter conflicts were rather simple overlapping
changes.

However, the cls_tcindex.c stuff was a bit more complex.

On the 'net' side, Cong is fixing several races and memory
leaks.  Whilst on the 'net-next' side we have Vlad adding
the rtnl-ness support.

What I've decided to do, in order to resolve this, is revert the
conversion over to using a workqueue that Cong did, bringing us back
to pure RCU.  I did it this way because I believe that either Cong's
races don't apply with have Vlad did things, or Cong will have to
implement the race fix slightly differently.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15 12:38:38 -08:00
Linus Torvalds
6e7bd3b549 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix MAC address setting in mac80211 pmsr code, from Johannes Berg.

 2) Probe SFP modules after being attached, from Russell King.

 3) Byte ordering bug in SMC rx_curs_confirmed code, from Ursula Braun.

 4) Revert some r8169 changes that are causing regressions, from Heiner
    Kallweit.

 5) Fix spurious connection timeouts in netfilter nat code, from Florian
    Westphal.

 6) SKB leak in tipc, from Hoang Le.

 7) Short packet checkum issue in mlx4, similar to a previous mlx5
    change, from Saeed Mahameed. The issue is that whilst padding bytes
    are usually zero, it is not guarateed and the hardware doesn't take
    the padding bytes into consideration when generating the checksum.

 8) Fix various races in cls_tcindex, from Cong Wang.

 9) Need to set stream ext to NULL before freeing in SCTP code, from Xin
    Long.

10) Fix locking in phy_is_started, from Heiner Kallweit.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
  net: ethernet: freescale: set FEC ethtool regs version
  net: hns: Fix object reference leaks in hns_dsaf_roce_reset()
  mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
  net: phy: fix potential race in the phylib state machine
  net: phy: don't use locking in phy_is_started
  selftests: fix timestamping Makefile
  net: dsa: bcm_sf2: potential array overflow in bcm_sf2_sw_suspend()
  net: fix possible overflow in __sk_mem_raise_allocated()
  dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
  net: phy: fix interrupt handling in non-started states
  sctp: set stream ext to NULL after freeing it in sctp_stream_outq_migrate
  sctp: call gso_reset_checksum when computing checksum in sctp_gso_segment
  net/mlx5e: XDP, fix redirect resources availability check
  net/mlx5: Fix a compilation warning in events.c
  net/mlx5: No command allowed when command interface is not ready
  net/mlx5e: Fix NULL pointer derefernce in set channels error flow
  netfilter: nft_compat: use-after-free when deleting targets
  team: avoid complex list operations in team_nl_cmd_options_set()
  net_sched: fix two more memory leaks in cls_tcindex
  net_sched: fix a memory leak in cls_tcindex
  ...
2019-02-15 08:00:11 -08:00
Andrey Ignatov
789f6bab84 libbpf: Introduce bpf_object__btf
Add new accessor for bpf_object to get opaque struct btf * from it.

struct btf * is needed for all operations with BTF and it's present in
bpf_object. The only thing missing is a way to get it.

Example use-case is to get BTF key_type_id and value_type_id for a map in
bpf_object. It can be done with btf__get_map_kv_tids() but that function
requires struct btf *.

Similar API can be added for struct btf_ext but no use-case for it yet.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-15 15:20:54 +01:00
Andrey Ignatov
1a11a4c74f libbpf: Introduce bpf_map__resize
Add bpf_map__resize() to change max_entries for a map.

Quite often necessary map size is unknown at compile time and can be
calculated only at run time.

Currently the following approach is used to do so:
* bpf_object__open_buffer() to open Elf file from a buffer;
* bpf_object__find_map_by_name() to find relevant map;
* bpf_map__def() to get map attributes and create struct
  bpf_create_map_attr from them;
* update max_entries in bpf_create_map_attr;
* bpf_create_map_xattr() to create new map with updated max_entries;
* bpf_map__reuse_fd() to replace the map in bpf_object with newly
  created one.

And after all this bpf_object can finally be loaded. The map will have
new size.

It 1) is quite a lot of steps; 2) doesn't take BTF into account.

For "2)" even more steps should be made and some of them require changes
to libbpf (e.g. to get struct btf * from bpf_object).

Instead the whole problem can be solved by introducing simple
bpf_map__resize() API that checks the map and sets new max_entries if
the map is not loaded yet.

So the new steps are:
* bpf_object__open_buffer() to open Elf file from a buffer;
* bpf_object__find_map_by_name() to find relevant map;
* bpf_map__resize() to update max_entries.

That's much simpler and works with BTF.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-15 15:20:42 +01:00
Andrii Nakryiko
d931206476 tools: sync uapi/linux/if_link.h header
Syncing if_link.h that got out of sync.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-14 15:31:39 -08:00
Andrii Nakryiko
1ad9cbb890 tools/bpf: replace bzero with memset
bzero() call is deprecated and superseded by memset().

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Reported-by: David Laight <david.laight@aculab.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-14 15:31:39 -08:00
Deepa Dinamani
39c1331962 selftests: fix timestamping Makefile
The clean target in the makefile conflicts with the generic
kselftests lib.mk, and fails to properly remove the compiled
test programs.

Remove the redundant rule, the TEST_GEN_FILES will be already
removed by the CLEAN macro in lib.mk.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Shuah Khan <shuah@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-14 12:03:16 -05:00
Peter Oskolkov
0fde56e438 selftests: bpf: add test_lwt_ip_encap selftest
This patch adds a bpf self-test to cover BPF_LWT_ENCAP_IP mode
in bpf_lwt_push_encap.

Covered:
- encapping in LWT_IN and LWT_XMIT
- IPv4 and IPv6

A follow-up patch will add GSO and VRF-enabled tests.

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-13 18:27:55 -08:00
Peter Oskolkov
755db4771c bpf: sync <kdir>/include/.../bpf.h with tools/include/.../bpf.h
This patch copies changes in bpf.h done by a previous patch
in this patchset from the kernel uapi include dir into tools
uapi include dir.

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-13 18:27:55 -08:00
Jiri Pirko
f5c7bd93c4 selftests: mlxsw: avoid double sourcing of lib.sh
Don't source lib.sh 2 times and make the script work with ifnames
passed on the command line.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12 12:03:29 -05:00
Prashant Bhole
ebbed0f46e tools: bpftool: doc, add text about feature-subcommand
This patch adds missing information about feature-subcommand in
bpftool.rst

Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-12 17:06:18 +01:00
Jiong Wang
64e39ee2c8 selftests: bpf: relax sub-register mode compilation criteria
Sub-register mode compilation was enabled only when there are eBPF "v3"
processor supports at both compilation time inside LLVM and runtime inside
kernel.

Given separation betwen build and test server could be often, this patch
removes the runtime support criteria.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-11 20:31:38 -08:00
Jiong Wang
bd4aed0ee7 selftests: bpf: centre kernel bpf objects under new subdir "progs"
At the moment, all kernel bpf objects are listed under BPF_OBJ_FILES.
Listing them manually sometimes causing patch conflict when people are
adding new testcases simultaneously.

It is better to centre all the related source files under a subdir
"progs", then auto-generate the object file list.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-11 20:31:38 -08:00
Jiong Wang
4836b4637e selftests: bpf: extend sub-register mode compilation to all bpf object files
At the moment, we only do extra sub-register mode compilation on bpf object
files used by "test_progs". These object files are really loaded and
executed.

This patch further extends sub-register mode compilation to all bpf object
files, even those without corresponding runtime tests. Because this could
help testing LLVM sub-register code-gen, kernel bpf selftest has much more
C testcases with reasonable size and complexity compared with LLVM
testsuite which only contains unit tests.

There were some file duplication inside BPF_OBJ_FILES_DUAL_COMPILE which
is removed now.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-11 20:31:38 -08:00
Jiong Wang
1727a9dce6 selftests: bpf: add "alu32" to .gitignore
"alu32" is a build dir and contains various files for BPF sub-register
code-gen testing.

This patch tells git to ignore it.

Suggested-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-11 20:31:38 -08:00
Martin KaFai Lau
e0b27b3f97 bpf: Add test_sock_fields for skb->sk and bpf_tcp_sock
This patch adds a C program to show the usage on
skb->sk and bpf_tcp_sock.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Martin KaFai Lau
fb47d1d931 bpf: Add skb->sk, bpf_sk_fullsock and bpf_tcp_sock tests to test_verifer
This patch tests accessing the skb->sk and the new helpers,
bpf_sk_fullsock and bpf_tcp_sock.

The errstr of some existing "reference tracking" tests is changed
with s/bpf_sock/sock/ and s/socket/sock/ where "sock" is from the
verifier's reg_type_str[].

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Martin KaFai Lau
281f9e7572 bpf: Sync bpf.h to tools/
This patch sync the uapi bpf.h to tools/.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10 19:46:17 -08:00
Bob Tracy
842fc0f5dc tools uapi: fix Alpha support
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Bob Tracy <rct@frus.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2019-02-10 19:16:24 -08:00
Linus Torvalds
212146f080 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "A couple of kernel side fixes:

   - Fix the Intel uncore driver on certain hardware configurations

   - Fix a CPU hotplug related memory allocation bug

   - Remove a spurious WARN()

  ... plus also a handful of perf tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf script python: Add Python3 support to tests/attr.py
  perf trace: Support multiple "vfs_getname" probes
  perf symbols: Filter out hidden symbols from labels
  perf symbols: Add fallback definitions for GELF_ST_VISIBILITY()
  tools headers uapi: Sync linux/in.h copy from the kernel sources
  perf clang: Do not use 'return std::move(something)'
  perf mem/c2c: Fix perf_mem_events to support powerpc
  perf tests evsel-tp-sched: Fix bitwise operator
  perf/core: Don't WARN() for impossible ring-buffer sizes
  perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()
  perf/x86/intel/uncore: Add Node ID mask
2019-02-10 09:48:18 -08:00
Eli Cohen
257eeded20 net: Move all TC actions identifiers to one place
Move all the TC identifiers to one place, to the same enum that defines
the identifier of police action. This makes it easier choose numbers for
new actions since they are now defined in one place. We preserve the
original values for binary compatibility. New IDs should be added inside
the enum.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10 09:28:43 -08:00