linux_dsm_epyc7002/net
Daniel Borkmann 81d947e2b8 net, sched: fix panic when updating miniq {b,q}stats
While working on fixing another bug, I ran into the following panic
on arm64 by simply attaching clsact qdisc, adding a filter and running
traffic on ingress to it:

  [...]
  [  178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000
  [  178.197314] Mem abort info:
  [  178.200121]   ESR = 0x96000004
  [  178.203168]   Exception class = DABT (current EL), IL = 32 bits
  [  178.209095]   SET = 0, FnV = 0
  [  178.212157]   EA = 0, S1PTW = 0
  [  178.215288] Data abort info:
  [  178.218175]   ISV = 0, ISS = 0x00000004
  [  178.222019]   CM = 0, WnR = 0
  [  178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33
  [  178.231531] [0000810fb501f000] *pgd=0000000000000000
  [  178.236508] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G        W        4.15.0-rc7+ #5
  [  178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
  [  178.326887] pstate: 60400005 (nZCv daif +PAN -UAO)
  [  178.331685] pc : __netif_receive_skb_core+0x49c/0xac8
  [  178.336728] lr : __netif_receive_skb+0x28/0x78
  [  178.341161] sp : ffff00002344b750
  [  178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580
  [  178.349769] x27: 0000000000000000 x26: ffff000009378000
  [...]
  [  178.418715] x1 : 0000000000000054 x0 : 0000000000000000
  [  178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4)
  [  178.430537] Call trace:
  [  178.432976]  __netif_receive_skb_core+0x49c/0xac8
  [  178.437670]  __netif_receive_skb+0x28/0x78
  [  178.441757]  process_backlog+0x9c/0x160
  [  178.445584]  net_rx_action+0x2f8/0x3f0
  [...]

Reason is that sch_ingress and sch_clsact are doing mini_qdisc_pair_init()
which sets up miniq pointers to cpu_{b,q}stats from the underlying qdisc.
Problem is that this cannot work since they are actually set up right after
the qdisc ->init() callback in qdisc_create(), so first packet going into
sch_handle_ingress() tries to call mini_qdisc_bstats_cpu_update() and we
therefore panic.

In order to fix this, allocation of {b,q}stats needs to happen before we
call into ->init(). In net-next, there's already such option through commit
d59f5ffa59 ("net: sched: a dflt qdisc may be used with per cpu stats").
However, the bug needs to be fixed in net still for 4.15. Thus, include
these bits to reduce any merge churn and reuse the static_flags field to
set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since
there is no other user left. Prashant Bhole ran into the same issue but
for net-next, thus adding him below as well as co-author. Same issue was
also reported by Sandipan Das when using bcc.

Fixes: 46209401f8 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath")
Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html
Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Co-authored-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Co-authored-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 15:02:36 -05:00
..
6lowpan License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
9p 9p: add missing module license for xen transport 2018-01-15 13:13:53 -05:00
802 treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
8021q 8021q: fix a memory leak for VLAN 0 device 2018-01-10 15:31:07 -05:00
appletalk treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
atm treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
ax25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
batman-adv batman-adv: Fix lock for ogm cnt access in batadv_iv_ogm_calc_tq 2017-12-04 11:47:33 +01:00
bluetooth treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
bpf bpf: add meta pointer for direct access 2017-09-26 13:36:44 -07:00
bridge net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks 2017-12-18 13:29:01 -05:00
caif caif_usb: use strlcpy() instead of strncpy() 2018-01-10 15:06:14 -05:00
can treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
ceph We have a set of file locking improvements from Zheng, rbd rw/ro 2017-11-21 05:38:32 -10:00
core net: Allow neigh contructor functions ability to modify the primary_key 2018-01-15 14:53:43 -05:00
dcb rtnetlink: make rtnl_register accept a flags parameter 2017-08-09 16:57:38 -07:00
dccp dccp: CVE-2017-8824: use-after-free in DCCP code 2017-12-05 18:08:53 -05:00
decnet treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
dns_resolver KEYS: Fix race between updating and finding a negative key 2017-10-18 09:12:40 +01:00
dsa net: remove duplicate includes 2017-12-13 13:18:46 -05:00
ethernet networking: make skb_push & __skb_push return void pointers 2017-06-16 11:48:40 -04:00
hsr net: hsr: Convert timers to use timer_setup() 2017-10-25 13:00:27 +09:00
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
ife MAINTAINERS: Update Yotam's E-mail 2017-11-01 12:19:03 +09:00
ipv4 ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY 2018-01-15 14:53:43 -05:00
ipv6 ipv6: ip6_make_skb() needs to clear cork.base.dst 2018-01-15 14:19:32 -05:00
ipx Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
iucv iucv: Convert sk_wmem_alloc accesses to refcount_t. 2017-07-03 02:31:22 -07:00
kcm make sock_alloc_file() do sock_release() on failures 2017-12-05 18:39:29 -05:00
key af_key: Fix memory leak in key_notify_policy. 2018-01-10 09:45:11 +01:00
l2tp l2tp: exit_net cleanup check added 2017-11-14 15:45:53 +09:00
l3mdev
lapb treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
llc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-11-15 11:56:19 -08:00
mac80211 mac80211: mesh: drop frames appearing to be from us 2018-01-04 15:51:53 +01:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
mpls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
ncsi treewide: setup_timer() -> timer_setup() (2 field) 2017-11-21 15:57:09 -08:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-01-08 20:21:39 -08:00
netlabel net/netlabel: Add list_next_rcu() in rcu_dereference(). 2017-11-18 10:32:41 +09:00
netlink netlink: extack needs to be reset each time through loop 2018-01-15 13:50:07 -05:00
netrom treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
nfc treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
nsh openvswitch: enable NSH support 2017-11-08 16:12:33 +09:00
openvswitch Revert "openvswitch: Add erspan tunnel support." 2018-01-15 14:33:16 -05:00
packet net/packet: fix a race in packet_bind() and packet_notifier() 2017-11-28 11:13:30 -05:00
phonet phonet: exit_net cleanup check added 2017-11-14 15:45:53 +09:00
psample MAINTAINERS: Update Yotam's E-mail 2017-11-01 12:19:03 +09:00
qrtr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-10 10:00:18 +09:00
rds RDS: null pointer dereference in rds_atomic_free_op 2018-01-04 14:19:26 -05:00
rfkill net: rfkill: gpio: Switch to devm_acpi_dev_add_driver_gpios() 2017-06-13 11:07:51 +02:00
rose treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
rxrpc rxrpc: Use correct netns source in rxrpc_release_sock() 2017-12-03 10:05:20 -05:00
sched net, sched: fix panic when updating miniq {b,q}stats 2018-01-16 15:02:36 -05:00
sctp sctp: do not allow the v4 socket to bind a v4mapped v6 address 2018-01-16 14:24:20 -05:00
smc net/smc: Fix preinitialization of buf_desc in __smc_buf_create() 2017-11-24 01:33:34 +09:00
strparser strparser: Call sock_owned_by_user_nocheck 2017-12-28 14:28:22 -05:00
sunrpc NFS client fixes for Linux 4.15-rc4 2017-12-16 13:12:53 -08:00
switchdev net: bridge: Add/del switchdev object on host join/leave 2017-11-10 13:41:40 +09:00
tipc tipc: fix a memory leak in tipc_nl_node_get_link() 2018-01-15 13:45:50 -05:00
tls net/tls: Fix inverted error codes to avoid endless loop 2018-01-15 14:21:57 -05:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-11-04 09:26:51 +09:00
vmw_vsock VSOCK: fix outdated sk_state value in hvs_release() 2017-12-05 15:07:37 -05:00
wimax License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
wireless cfg80211: check dev_set_name() return value 2018-01-15 11:35:06 +01:00
x25 treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
xfrm xfrm: Fix a race in the xdst pcpu cache. 2018-01-10 12:14:28 +01:00
compat.c net: compat: assert the size of cmsg copied in is as expected 2017-09-20 15:36:18 -07:00
Kconfig net: Remove CONFIG_NETFILTER_DEBUG and _ASSERT() macros. 2017-09-04 13:25:20 +02:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
socket.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-01-10 17:55:42 -08:00
sysctl_net.c