linux_dsm_epyc7002/net/batman-adv
Andrew Lunn 1bc4e2b000 batman-adv: Avoid endless loop in bat-on-bat netdevice check
batman-adv checks in different situation if a new device is already on top
of a different batman-adv device. This is done by getting the iflink of a
device and all its parent. It assumes that this iflink is always a parent
device in an acyclic graph. But this assumption is broken by devices like
veth which are actually a pair of two devices linked to each other. The
recursive check would therefore get veth0 when calling dev_get_iflink on
veth1. And it gets veth0 when calling dev_get_iflink with veth1.

Creating a veth pair and loading batman-adv freezes parts of the system

    ip link add veth0 type veth peer name veth1
    modprobe batman-adv

An RCU stall will be detected on the system which cannot be fixed.

    INFO: rcu_sched self-detected stall on CPU
            1: (5264 ticks this GP) idle=3e9/140000000000001/0
    softirq=144683/144686 fqs=5249
             (t=5250 jiffies g=46 c=45 q=43)
    Task dump for CPU 1:
    insmod          R  running task        0   247    245 0x00000008
     ffffffff8151f140 ffffffff8107888e ffff88000fd141c0 ffffffff8151f140
     0000000000000000 ffffffff81552df0 ffffffff8107b420 0000000000000001
     ffff88000e3fa700 ffffffff81540b00 ffffffff8107d667 0000000000000001
    Call Trace:
     <IRQ>  [<ffffffff8107888e>] ? rcu_dump_cpu_stacks+0x7e/0xd0
     [<ffffffff8107b420>] ? rcu_check_callbacks+0x3f0/0x6b0
     [<ffffffff8107d667>] ? hrtimer_run_queues+0x47/0x180
     [<ffffffff8107cf9d>] ? update_process_times+0x2d/0x50
     [<ffffffff810873fb>] ? tick_handle_periodic+0x1b/0x60
     [<ffffffff810290ae>] ? smp_trace_apic_timer_interrupt+0x5e/0x90
     [<ffffffff813bbae2>] ? apic_timer_interrupt+0x82/0x90
     <EOI>  [<ffffffff812c3fd7>] ? __dev_get_by_index+0x37/0x40
     [<ffffffffa0031f3e>] ? batadv_hard_if_event+0xee/0x3a0 [batman_adv]
     [<ffffffff812c5801>] ? register_netdevice_notifier+0x81/0x1a0
    [...]

This can be avoided by checking if two devices are each others parent and
stopping the check in this situation.

Fixes: b7eddd0b39 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven@narfation.org: rewritten description, extracted fix]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
2016-02-16 22:16:33 +08:00
..
bat_algo.h
bat_iv_ogm.c
bitarray.c
bitarray.h
bridge_loop_avoidance.c batman-adv: Avoid recursive call_rcu for batadv_bla_claim 2016-01-16 22:48:29 +08:00
bridge_loop_avoidance.h
debugfs.c
debugfs.h
distributed-arp-table.c
distributed-arp-table.h
fragmentation.c
fragmentation.h
gateway_client.c batman-adv: Only put gw_node list reference when removed 2016-02-16 17:52:25 +08:00
gateway_client.h
gateway_common.c
gateway_common.h
hard-interface.c batman-adv: Avoid endless loop in bat-on-bat netdevice check 2016-02-16 22:16:33 +08:00
hard-interface.h batman-adv: Drop immediate batadv_hard_iface free function 2016-01-16 22:49:51 +08:00
hash.c
hash.h
icmp_socket.c
icmp_socket.h
Kconfig
main.c
main.h
Makefile
multicast.c
multicast.h
network-coding.c batman-adv: Avoid recursive call_rcu for batadv_nc_node 2016-01-16 22:48:41 +08:00
network-coding.h
originator.c batman-adv: Drop immediate orig_node free function 2016-01-16 22:50:00 +08:00
originator.h batman-adv: Drop immediate orig_node free function 2016-01-16 22:50:00 +08:00
packet.h
routing.c
routing.h
send.c
send.h
soft-interface.c
soft-interface.h
sysfs.c
sysfs.h
translation-table.c batman-adv: Only put orig_node_vlan list reference when removed 2016-02-16 17:52:26 +08:00
translation-table.h
types.h