linux_dsm_epyc7002/kernel/bpf
Jakub Sitnicki 035ff358f2 net: Generate reuseport group ID on group creation
Commit 736b46027e ("net: Add ID (if needed) to sock_reuseport and expose
reuseport_lock") has introduced lazy generation of reuseport group IDs that
survive group resize.

By comparing the identifier we check if BPF reuseport program is not trying
to select a socket from a BPF map that belongs to a different reuseport
group than the one the packet is for.

Because SOCKARRAY used to be the only BPF map type that can be used with
reuseport BPF, it was possible to delay the generation of reuseport group
ID until a socket from the group was inserted into BPF map for the first
time.

Now that SOCK{MAP,HASH} can be used with reuseport BPF we have two options,
either generate the reuseport ID on map update, like SOCKARRAY does, or
allocate an ID from the start when reuseport group gets created.

This patch takes the latter approach to keep sockmap free of calls into
reuseport code. This streamlines the reuseport_id access as its lifetime
now matches the longevity of reuseport object.

The cost of this simplification, however, is that we allocate reuseport IDs
for all SO_REUSEPORT users. Even those that don't use SOCKARRAY in their
setups. With the way identifiers are currently generated, we can have at
most S32_MAX reuseport groups, which hopefully is sufficient. If we ever
get close to the limit, we can switch an u64 counter like sk_cookie.

Another change is that we now always call into SOCKARRAY logic to unlink
the socket from the map when unhashing or closing the socket. Previously we
did it only when at least one socket from the group was in a BPF map.

It is worth noting that this doesn't conflict with sockmap tear-down in
case a socket is in a SOCK{MAP,HASH} and belongs to a reuseport
group. sockmap tear-down happens first:

  prot->unhash
  `- tcp_bpf_unhash
     |- tcp_bpf_remove
     |  `- while (sk_psock_link_pop(psock))
     |     `- sk_psock_unlink
     |        `- sock_map_delete_from_link
     |           `- __sock_map_delete
     |              `- sock_map_unref
     |                 `- sk_psock_put
     |                    `- sk_psock_drop
     |                       `- rcu_assign_sk_user_data(sk, NULL)
     `- inet_unhash
        `- reuseport_detach_sock
           `- bpf_sk_reuseport_detach
              `- WRITE_ONCE(sk->sk_user_data, NULL)

Suggested-by: Martin Lau <kafai@fb.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200218171023.844439-10-jakub@cloudflare.com
2020-02-21 22:29:45 +01:00
..
arraymap.c bpf: Add lookup and update batch ops to arraymap 2020-01-15 14:00:35 -08:00
bpf_lru_list.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
bpf_lru_list.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
bpf_struct_ops_types.h bpf: tcp: Support tcp_congestion_ops in bpf 2020-01-09 08:46:18 -08:00
bpf_struct_ops.c bpf: Reuse log from btf_prase_vmlinux() in btf_struct_ops_init() 2020-01-29 16:40:54 +01:00
btf.c bpf: Fix modifier skipping logic 2020-02-04 00:06:07 +01:00
cgroup.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-01-09 12:13:43 -08:00
core.c bpf: Add BPF_FUNC_jiffies64 2020-01-22 16:30:10 -08:00
cpumap.c xdp: Make cpumap flush_list common for all map instances 2019-12-19 21:09:43 -08:00
devmap.c bpf, xdp: Remove no longer required rcu_read_{un}lock() 2020-01-27 11:16:25 +01:00
disasm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
disasm.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
dispatcher.c bpf: Allow to resolve bpf trampoline and dispatcher in unwind 2020-01-25 07:12:40 -08:00
hashtab.c bpf: Add batch ops to all htab bpf map 2020-01-15 14:00:35 -08:00
helpers.c bpf: Add BPF_FUNC_jiffies64 2020-01-22 16:30:10 -08:00
inode.c Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-02-08 13:26:41 -08:00
local_storage.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-12-22 09:54:33 -08:00
lpm_trie.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-22 08:59:24 -04:00
Makefile bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS 2020-01-09 08:46:18 -08:00
map_in_map.c bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS 2020-01-09 08:46:18 -08:00
map_in_map.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
offload.c nsfs: clean-up ns_get_path() signature to return int 2019-12-08 19:09:37 -05:00
percpu_freelist.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
percpu_freelist.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
queue_stack_maps.c bpf: move memory size checks to bpf_map_charge_init() 2019-05-31 16:52:56 -07:00
reuseport_array.c net: Generate reuseport group ID on group creation 2020-02-21 22:29:45 +01:00
stackmap.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-12-03 09:29:50 -08:00
syscall.c bpf: Introduce dynamic program extensions 2020-01-22 23:04:52 +01:00
sysfs_btf.c btf: fix return value check in btf_vmlinux_init() 2019-08-15 22:18:17 -07:00
tnum.c bpf: Fix incorrect verifier simulation of ARSH under ALU32 2020-01-15 13:39:59 -08:00
trampoline.c bpf: Allow to resolve bpf trampoline and dispatcher in unwind 2020-01-25 07:12:40 -08:00
verifier.c bpf: Allow selecting reuseport socket from a SOCKMAP/SOCKHASH 2020-02-21 22:29:45 +01:00
xskmap.c xsk: Make xskmap flush_list common for all map instances 2019-12-19 21:09:43 -08:00