linux_dsm_epyc7002/net/sched
WANG Cong dc327f8931 net_sched: close another race condition in tcf_mirred_release()
We saw the following extra refcount release on veth device:

  kernel: [7957821.463992] unregister_netdevice: waiting for mesos50284 to become free. Usage count = -1

Since we heavily use mirred action to redirect packets to veth, I think
this is caused by the following race condition:

CPU0:
tcf_mirred_release(): (in RCU callback)
        struct net_device *dev = rcu_dereference_protected(m->tcfm_dev, 1);

CPU1:
mirred_device_event():
        spin_lock_bh(&mirred_list_lock);
        list_for_each_entry(m, &mirred_list, tcfm_list) {
                if (rcu_access_pointer(m->tcfm_dev) == dev) {
                        dev_put(dev);
                        /* Note : no rcu grace period necessary, as
                         * net_device are already rcu protected.
                         */
                        RCU_INIT_POINTER(m->tcfm_dev, NULL);
                }
        }
        spin_unlock_bh(&mirred_list_lock);

CPU0:
tcf_mirred_release():
        spin_lock_bh(&mirred_list_lock);
        list_del(&m->tcfm_list);
        spin_unlock_bh(&mirred_list_lock);
        if (dev)               // <======== Still refers to the old m->tcfm_dev
                dev_put(dev);  // <======== dev_put() is called on it again

The action init code path is good because it is impossible to modify
an action that is being removed.

So, fix this by moving everything under the spinlock.
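
The interleaving above can be replayed deterministically in user space. The following is a minimal sketch, not the actual patch: `fake_dev`, `fake_mirred`, `list_lock()`, `list_unlock()` and `fake_dev_put()` are hypothetical stand-ins for the kernel's `struct net_device`, `struct tcf_mirred`, `spin_lock_bh()`/`spin_unlock_bh()` on `mirred_list_lock`, and `dev_put()`. It assumes the action holds exactly one reference on the device, and it only contrasts sampling the pointer before taking the lock with sampling it under the lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct fake_dev    { int refcnt; };
struct fake_mirred { struct fake_dev *dev; };

/* Single-threaded model of mirred_list_lock: it only checks lock pairing. */
static int lock_held;
static void list_lock(void)   { assert(!lock_held); lock_held = 1; }
static void list_unlock(void) { assert(lock_held);  lock_held = 0; }

/* Models dev_put(): drop one reference. */
static void fake_dev_put(struct fake_dev *dev) { dev->refcnt--; }

/* Models mirred_device_event(): put the device and clear the pointer,
 * both under the lock. */
static void device_event(struct fake_mirred *m, struct fake_dev *dev)
{
    list_lock();
    if (m->dev == dev) {
        fake_dev_put(dev);
        m->dev = NULL;
    }
    list_unlock();
}

/* Old ordering: the pointer is sampled before the lock is taken. */
int racy_release_refcnt(void)
{
    struct fake_dev dev = { .refcnt = 1 };  /* assumed: one ref held by the action */
    struct fake_mirred m = { .dev = &dev };

    struct fake_dev *stale = m.dev;  /* CPU0: read outside the lock */
    device_event(&m, &dev);          /* CPU1: put + clear under the lock */
    list_lock();                     /* CPU0: list_del() would happen here */
    list_unlock();
    if (stale)
        fake_dev_put(stale);         /* second put on the same reference */
    return dev.refcnt;
}

/* New ordering: the pointer is sampled and released under the lock. */
int fixed_release_refcnt(void)
{
    struct fake_dev dev = { .refcnt = 1 };
    struct fake_mirred m = { .dev = &dev };

    device_event(&m, &dev);          /* CPU1 runs first, as in the trace */
    list_lock();
    struct fake_dev *cur = m.dev;    /* CPU0: sees NULL, so no second put */
    if (cur)
        fake_dev_put(cur);
    list_unlock();
    return dev.refcnt;
}
```

In this model racy_release_refcnt() ends at -1, the same value as in the log line above, while fixed_release_refcnt() ends at 0.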

Fixes: 2ee22a90c7 ("net_sched: act_mirred: remove spinlock in fast path")
Fixes: 6bd00b8506 ("act_mirred: fix a race condition on mirred_list")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-17 12:40:28 -04:00
act_api.c sched: align nlattr properly when needed 2016-04-26 12:00:49 -04:00
act_bpf.c bpf: wire in data and data_end for cls_act_bpf 2016-05-06 16:01:54 -04:00
act_connmark.c sched: align nlattr properly when needed 2016-04-26 12:00:49 -04:00
act_csum.c sched: align nlattr properly when needed 2016-04-26 12:00:49 -04:00
act_gact.c net/sched: act_gact: Update statistics when offloaded to hardware 2016-05-16 13:43:50 -04:00
act_ife.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-15 13:32:48 -04:00
act_ipt.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-15 13:32:48 -04:00
act_meta_mark.c Support to encoding decoding skb mark on IFE action 2016-03-01 17:15:23 -05:00
act_meta_skbprio.c Support to encoding decoding skb prio on IFE action 2016-03-01 17:15:23 -05:00
act_mirred.c net_sched: close another race condition in tcf_mirred_release() 2016-05-17 12:40:28 -04:00
act_nat.c sched: align nlattr properly when needed 2016-04-26 12:00:49 -04:00
act_pedit.c sched: align nlattr properly when needed 2016-04-26 12:00:49 -04:00
act_police.c net_sched: add network namespace support for tc actions 2016-02-25 14:16:21 -05:00
act_simple.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-15 13:32:48 -04:00
act_skbedit.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-15 13:32:48 -04:00
act_vlan.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-15 13:32:48 -04:00
cls_api.c net: sched: fix call_rcu() race on classifier module unloads 2015-05-21 18:48:18 -04:00
cls_basic.c net_sched: destroy proto tp when all filters are gone 2015-03-09 15:35:55 -04:00
cls_bpf.c bpf: wire in data and data_end for cls_act_bpf 2016-05-06 16:01:54 -04:00
cls_cgroup.c cls_cgroup: factor out classid retrieval 2015-07-20 12:41:30 -07:00
cls_flow.c sched: cls_flow: use skb_to_full_sk() helper 2015-11-08 20:56:39 -05:00
cls_flower.c net/sched: cls_flower: Hardware offloaded filters statistics support 2016-05-16 13:43:50 -04:00
cls_fw.c net: revert "net_sched: move tp->root allocation into fw_init()" 2015-09-24 14:33:30 -07:00
cls_route.c net_sched: destroy proto tp when all filters are gone 2015-03-09 15:35:55 -04:00
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h net_sched: convert rsvp to call tcf_exts_destroy from rcu callback 2015-08-26 11:01:45 -07:00
cls_tcindex.c net_sched: convert tcindex to call tcf_exts_destroy from rcu callback 2015-08-26 11:01:44 -07:00
cls_u32.c net: cls_u32: Add support for skip-sw flag to tc u32 classifier. 2016-05-16 13:30:57 -04:00
em_canid.c net: sched: remove tcf_proto from ematch calls 2014-10-06 18:02:32 -04:00
em_cmp.c net_sched: cleanups 2011-01-19 23:31:12 -08:00
em_ipset.c netfilter: x_tables: Pass struct net in xt_action_param 2015-09-18 21:58:14 +02:00
em_meta.c qdisc: constify meta_type_ops structures 2016-04-14 00:35:30 -04:00
em_nbyte.c net: sched: remove tcf_proto from ematch calls 2014-10-06 18:02:32 -04:00
em_text.c net: Remove state argument from skb_find_text() 2015-02-22 15:59:54 -05:00
em_u32.c net_sched: cleanups 2011-01-19 23:31:12 -08:00
ematch.c ematch: Fix auto-loading of ematch modules. 2015-02-20 15:30:56 -05:00
Kconfig Support to encoding decoding skb prio on IFE action 2016-03-01 17:15:23 -05:00
Makefile Support to encoding decoding skb prio on IFE action 2016-03-01 17:15:23 -05:00
sch_api.c sched: align nlattr properly when needed 2016-04-26 12:00:49 -04:00
sch_atm.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_blackhole.c net/sched: make sch_blackhole.c explicitly non-modular 2015-10-09 07:52:28 -07:00
sch_cbq.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_choke.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_codel.c codel: split into multiple files 2016-04-25 16:44:27 -04:00
sch_drr.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_dsmark.c net_sched: dsmark: use qdisc_dequeue_peeked() 2016-03-08 14:35:13 -05:00
sch_fifo.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_fq_codel.c fq_codel: fix memory limitation drift 2016-05-16 21:54:24 -04:00
sch_fq.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_generic.c net: remove dev->trans_start 2016-05-04 14:16:50 -04:00
sch_gred.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_hfsc.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_hhf.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_htb.c sched: use nla_put_u64_64bit() 2016-04-25 15:09:09 -04:00
sch_ingress.c net, sched: add clsact qdisc 2016-01-10 22:13:15 -05:00
sch_mq.c net: sched: use pfifo_fast for non real queues 2016-03-03 17:38:46 -05:00
sch_mqprio.c net: sched: use pfifo_fast for non real queues 2016-03-03 17:38:46 -05:00
sch_multiq.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_netem.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-04 00:52:29 -04:00
sch_pie.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_plug.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_prio.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_qfq.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_red.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_sfb.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_sfq.c net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
sch_tbf.c sched: use nla_put_u64_64bit() 2016-04-25 15:09:09 -04:00
sch_teql.c net: sched: fix skb->protocol use in case of accelerated vlan path 2015-01-13 17:51:08 -05:00