linux_dsm_epyc7002/include
Yunsheng Lin 21c7151092 net: sched: fix packet stuck problem for lockless qdisc
[ Upstream commit a90c57f2cedd52a511f739fb55e6244e22e1a2fb ]

Lockless qdisc has below concurrent problem:
    cpu0                 cpu1
     .                     .
q->enqueue                 .
     .                     .
qdisc_run_begin()          .
     .                     .
dequeue_skb()              .
     .                     .
sch_direct_xmit()          .
     .                     .
     .                q->enqueue
     .             qdisc_run_begin()
     .            return and do nothing
     .                     .
qdisc_run_end()            .

cpu1 enqueue a skb without calling __qdisc_run() because cpu0
has not released the lock yet and spin_trylock() return false
for cpu1 in qdisc_run_begin(), and cpu0 do not see the skb
enqueued by cpu1 when calling dequeue_skb() because cpu1 may
enqueue the skb after cpu0 calling dequeue_skb() and before
cpu0 calling qdisc_run_end().

Lockless qdisc has below another concurrent problem when
tx_action is involved:

cpu0(serving tx_action)     cpu1             cpu2
          .                   .                .
          .              q->enqueue            .
          .            qdisc_run_begin()       .
          .              dequeue_skb()         .
          .                   .            q->enqueue
          .                   .                .
          .             sch_direct_xmit()      .
          .                   .         qdisc_run_begin()
          .                   .       return and do nothing
          .                   .                .
 clear __QDISC_STATE_SCHED    .                .
 qdisc_run_begin()            .                .
 return and do nothing        .                .
          .                   .                .
          .            qdisc_run_end()         .

This patch fixes the above data race by:
1. If the first spin_trylock() return false and STATE_MISSED is
   not set, set STATE_MISSED and retry another spin_trylock() in
   case other CPU may not see STATE_MISSED after it releases the
   lock.
2. reschedule if STATE_MISSED is set after the lock is released
   at the end of qdisc_run_end().

For tx_action case, STATE_MISSED is also set when cpu1 is at the
end if qdisc_run_end(), so tx_action will be rescheduled again
to dequeue the skb enqueued by cpu2.

Clear STATE_MISSED before retrying a dequeuing when dequeuing
returns NULL in order to reduce the overhead of the second
spin_trylock() and __netif_schedule() calling.

Also clear the STATE_MISSED before calling __netif_schedule()
at the end of qdisc_run_end() to avoid doing another round of
dequeuing in the pfifo_fast_dequeue().

The performance impact of this patch, tested using pktgen and
dummy netdev with pfifo_fast qdisc attached:

 threads  without+this_patch   with+this_patch      delta
    1        2.61Mpps            2.60Mpps           -0.3%
    2        3.97Mpps            3.82Mpps           -3.7%
    4        5.62Mpps            5.59Mpps           -0.5%
    8        2.78Mpps            2.77Mpps           -0.3%
   16        2.22Mpps            2.22Mpps           -0.0%

Fixes: 6b3ba9146f ("net: sched: allow qdiscs to handle locking")
Acked-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03 09:00:47 +02:00
..
acpi ACPI: scan: Use unique number for instance_no 2021-03-30 14:32:06 +02:00
asm-generic static_call: Allow module use without exposing static_call_key 2021-03-30 14:31:53 +02:00
clocksource
crypto crypto: poly1305 - fix poly1305_core_setkey() declaration 2021-05-14 09:50:13 +02:00
drm drm/dp/mst: Export drm_dp_get_vc_payload_bw() 2021-02-10 09:29:18 +01:00
dt-bindings ASoC: dt-bindings: lpass: Fix and common up lpass dai ids 2021-02-03 23:28:46 +01:00
keys security: keys: trusted: fix TPM2 authorizations 2021-05-14 09:50:20 +02:00
kunit kunit: fix display of failed expectations for strings 2020-11-10 13:45:15 -07:00
kvm ARM: 2020-10-23 11:17:56 -07:00
linux linux/bits.h: fix compilation error with GENMASK 2021-06-03 09:00:45 +02:00
math-emu
media media: v4l2-ctrls: fix reference to freed memory 2021-05-11 14:47:39 +02:00
memory
misc
net net: sched: fix packet stuck problem for lockless qdisc 2021-06-03 09:00:47 +02:00
pcmcia
ras mm,hwpoison: introduce MF_MSG_UNSPLIT_THP 2020-10-16 11:11:17 -07:00
rdma RDMA: Lift ibdev_to_node from rds to common code 2021-02-26 10:12:59 +01:00
scsi Fix misc new gcc warnings 2021-05-11 14:47:36 +02:00
soc net: dsa: felix: implement port flushing on .phylink_mac_link_down 2021-02-17 11:02:27 +01:00
sound ALSA: hda: intel-nhlt: verify config type 2021-03-09 11:11:14 +01:00
target scsi: target: core: Add cmd length set before cmd complete 2021-03-17 17:06:25 +01:00
trace SUNRPC: Remove trace_xprt_transmit_queued 2021-05-19 10:13:03 +02:00
uapi netfilter: xt_SECMARK: add new revision to fix structure layout 2021-05-19 10:13:06 +02:00
vdso
video gpu: ipu-v3: remove unused functions 2020-10-26 10:42:38 +01:00
xen Xen/gntdev: correct error checking in gntdev_map_grant_pages() 2021-02-23 15:53:24 +01:00