linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Paolo Abeni	b830650c1a	net: let skb_orphan_partial wake-up waiters. commit 9adc89af724f12a03b47099cd943ed54e877cd59 upstream. Currently the mentioned helper can end-up freeing the socket wmem without waking-up any processes waiting for more write memory. If the partially orphaned skb is attached to an UDP (or raw) socket, the lack of wake-up can hang the user-space. Even for TCP sockets not calling the sk destructor could have bad effects on TSQ. Address the issue using skb_orphan to release the sk wmem before setting the new sock_efree destructor. Additionally bundle the whole ownership update in a new helper, so that later other potential users could avoid duplicate code. v1 -> v2: - use skb_orphan() instead of sort of open coding it (Eric) - provide an helper for the ownership change (Eric) Fixes: `f6ba8d33cf` ("netem: fix skb_orphan_partial()") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:03 +02:00
Maciej Żenczykowski	5d9216b851	net-ipv6: bugfix - raw & sctp - switch to ipv6_can_nonlocal_bind() commit 630e4576f83accf90366686f39808d665d8dbecc upstream. Found by virtue of ipv6 raw sockets not honouring the per-socket IP{,V6}_FREEBIND setting. Based on hits found via: git grep '[.]ip_nonlocal_bind' We fix both raw ipv6 sockets to honour IP{,V6}_FREEBIND and IP{,V6}_TRANSPARENT, and we fix sctp sockets to honour IP{,V6}_TRANSPARENT (they already honoured FREEBIND), and not just the ipv6 'ip_nonlocal_bind' sysctl. The helper is defined as: static inline bool ipv6_can_nonlocal_bind(struct net net, struct inet_sock inet) { return net->ipv6.sysctl.ip_nonlocal_bind \|\| inet->freebind \|\| inet->transparent; } so this change only widens the accepted opt-outs and is thus a clean bugfix. I'm not entirely sure what 'fixes' tag to add, since this is AFAICT an ancient bug, but IMHO this should be applied to stable kernels as far back as possible. As such I'm adding a 'fixes' tag with the commit that originally added the helper, which happened in 4.19. Backporting to older LTS kernels (at least 4.9 and 4.14) would presumably require open-coding it or backporting the helper as well. Other possibly relevant commits: v4.18-rc6-1502-g83ba4645152d net: add helpers checking if socket can be bound to nonlocal address v4.18-rc6-1431-gd0c1f01138c4 net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind v4.14-rc5-271-gb71d21c274ef sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND v4.7-rc7-1883-g9b9742022888 sctp: support ipv6 nonlocal bind v4.1-12247-g35a256fee52c ipv6: Nonlocal bind Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: `83ba464515` ("net: add helpers checking if socket can be bound to nonlocal address") Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-By: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Kurt Kanzenbach	b82816d778	net: hsr: Reset MAC header for Tx path commit 9d6803921a16f4d768dc41a75375629828f4d91e upstream. Reset MAC header in HSR Tx path. This is needed, because direct packet transmission, e.g. by specifying PACKET_QDISC_BYPASS does not reset the MAC header. This has been observed using the following setup: \|$ ip link add name hsr0 type hsr slave1 lan0 slave2 lan1 supervision 45 version 1 \|$ ifconfig hsr0 up \|$ ./test hsr0 The test binary is using mmap'ed sockets and is specifying the PACKET_QDISC_BYPASS socket option. This patch resolves the following warning on a non-patched kernel: \|[ 112.725394] ------------[ cut here ]------------ \|[ 112.731418] WARNING: CPU: 1 PID: 257 at net/hsr/hsr_forward.c:560 hsr_forward_skb+0x484/0x568 \|[ 112.739962] net/hsr/hsr_forward.c:560: Malformed frame (port_src hsr0) The warning can be safely removed, because the other call sites of hsr_forward_skb() make sure that the skb is prepared correctly. Fixes: `d346a3fae3` ("packet: introduce PACKET_QDISC_BYPASS socket option") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Johannes Berg	9b9c910ccc	mac80211: fix TXQ AC confusion commit 1153a74768a9212daadbb50767aa400bc6a0c9b0 upstream. Normally, TXQs have txq->tid = tid; txq->ac = ieee80211_ac_from_tid(tid); However, the special management TXQ actually has txq->tid = IEEE80211_NUM_TIDS; // 16 txq->ac = IEEE80211_AC_VO; This makes sense, but ieee80211_ac_from_tid(16) is the same as ieee80211_ac_from_tid(0) which is just IEEE80211_AC_BE. Now, normally this is fine. However, if the netdev queues were stopped, then the code in ieee80211_tx_dequeue() will propagate the stop from the interface (vif->txqs_stopped[]) if the AC 2 (ieee80211_ac_from_tid(txq->tid)) is marked as stopped. On wake, however, __ieee80211_wake_txqs() will wake the TXQ if AC 0 (txq->ac) is woken up. If a driver stops all queues with ieee80211_stop_tx_queues() and then wakes them again with ieee80211_wake_tx_queues(), the ieee80211_wake_txqs() tasklet will run to resync queue and TXQ state. If all queues were woken, then what'll happen is that _ieee80211_wake_txqs() will run in order of HW queues 0-3, typically (and certainly for iwlwifi) corresponding to ACs 0-3, so it'll call __ieee80211_wake_txqs() for each AC in order 0-3. When __ieee80211_wake_txqs() is called for AC 0 (VO) that'll wake up the management TXQ (remember its tid is 16), and the driver's wake_tx_queue() will be called. That tries to get a frame, which will immediately stop the TXQ again, because now we check against AC 2, and AC 2 hasn't yet been marked as woken up again in sdata->vif.txqs_stopped[] since we're only in the __ieee80211_wake_txqs() call for AC 0. Thus, the management TXQ will never be started again. Fix this by checking txq->ac directly instead of calculating the AC as ieee80211_ac_from_tid(txq->tid). Fixes: `adf8ed01e4` ("mac80211: add an optional TXQ for other PS-buffered frames") Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20210323210500.bf4d50afea4a.I136ffde910486301f8818f5442e3c9bf8670a9c4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Ben Greear	cc357c2935	mac80211: fix time-is-after bug in mlme commit 7d73cd946d4bc7d44cdc5121b1c61d5d71425dea upstream. The incorrect timeout check caused probing to happen when it did not need to happen. This in turn caused tx performance drop for around 5 seconds in ath10k-ct driver. Possibly that tx drop is due to a secondary issue, but fixing the probe to not happen when traffic is running fixes the symptom. Signed-off-by: Ben Greear <greearb@candelatech.com> Fixes: `9abf4e4983` ("mac80211: optimize station connection monitor") Acked-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20210330230749.14097-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Johannes Berg	cc1a702e6e	cfg80211: check S1G beacon compat element length commit b5ac0146492fc5c199de767e492be8a66471011a upstream. We need to check the length of this element so that we don't access data beyond its end. Fix that. Fixes: `9eaffe5078` ("cfg80211: convert S1G beacon to scan results") Link: https://lore.kernel.org/r/20210408142826.f6f4525012de.I9fdeff0afdc683a6024e5ea49d2daa3cd2459d11@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Johannes Berg	fea52345f4	nl80211: fix potential leak of ACL params commit abaf94ecc9c356d0b885a84edef4905cdd89cfdd upstream. In case nl80211_parse_unsol_bcast_probe_resp() results in an error, need to "goto out" instead of just returning to free possibly allocated data. Fixes: `7443dcd1f1` ("nl80211: Unsolicited broadcast probe response support") Link: https://lore.kernel.org/r/20210408142833.d8bc2e2e454a.If290b1ba85789726a671ff0b237726d4851b5b0f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Johannes Berg	42e4450e37	nl80211: fix beacon head validation commit 9a6847ba1747858ccac53c5aba3e25c54fbdf846 upstream. If the beacon head attribute (NL80211_ATTR_BEACON_HEAD) is too short to even contain the frame control field, we access uninitialized data beyond the buffer. Fix this by checking the minimal required size first. We used to do this until S1G support was added, where the fixed data portion has a different size. Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: `1d47f1198d` ("nl80211: correctly validate S1G beacon head") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456b3cb96cab9771d252b8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Vlad Buslov	81692c6add	net: sched: fix action overwrite reference counting commit 87c750e8c38bce706eb32e4d8f1e3402f2cebbd4 upstream. Action init code increments reference counter when it changes an action. This is the desired behavior for cls API which needs to obtain action reference for every classifier that points to action. However, act API just needs to change the action and releases the reference before returning. This sequence breaks when the requested action doesn't exist, which causes act API init code to create new action with specified index, but action is still released before returning and is deleted (unless it was referenced concurrently by cls API). Reproduction: $ sudo tc actions ls action gact $ sudo tc actions change action gact drop index 1 $ sudo tc actions ls action gact Extend tcf_action_init() to accept 'init_res' array and initialize it with action->ops->init() result. In tcf_action_add() remove pointers to created actions from actions array before passing it to tcf_action_put_many(). Fixes: `cae422f379` ("net: sched: use reference counting action init") Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Pavel Tikhomirov	cdcf3829f4	net: sched: sch_teql: fix null-pointer dereference commit 1ffbc7ea91606e4abd10eb60de5367f1c86daf5e upstream. Reproduce: modprobe sch_teql tc qdisc add dev teql0 root teql0 This leads to (for instance in Centos 7 VM) OOPS: [ 532.366633] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 532.366733] IP: [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.366825] PGD 80000001376d5067 PUD 137e37067 PMD 0 [ 532.366906] Oops: 0000 [#1] SMP [ 532.366987] Modules linked in: sch_teql ... [ 532.367945] CPU: 1 PID: 3026 Comm: tc Kdump: loaded Tainted: G ------------ T 3.10.0-1062.7.1.el7.x86_64 #1 [ 532.368041] Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.2 04/01/2014 [ 532.368125] task: ffff8b7d37d31070 ti: ffff8b7c9fdbc000 task.ti: ffff8b7c9fdbc000 [ 532.368224] RIP: 0010:[<ffffffffc06124a8>] [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.368320] RSP: 0018:ffff8b7c9fdbf8e0 EFLAGS: 00010286 [ 532.368394] RAX: ffffffffc0612490 RBX: ffff8b7cb1565e00 RCX: ffff8b7d35ba2000 [ 532.368476] RDX: ffff8b7d35ba2000 RSI: 0000000000000000 RDI: ffff8b7cb1565e00 [ 532.368557] RBP: ffff8b7c9fdbf8f8 R08: ffff8b7d3fd1f140 R09: ffff8b7d3b001600 [ 532.368638] R10: ffff8b7d3b001600 R11: ffffffff84c7d65b R12: 00000000ffffffd8 [ 532.368719] R13: 0000000000008000 R14: ffff8b7d35ba2000 R15: ffff8b7c9fdbf9a8 [ 532.368800] FS: 00007f6a4e872740(0000) GS:ffff8b7d3fd00000(0000) knlGS:0000000000000000 [ 532.368885] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 532.368961] CR2: 00000000000000a8 CR3: 00000001396ee000 CR4: 00000000000206e0 [ 532.369046] Call Trace: [ 532.369159] [<ffffffff84c8192e>] qdisc_create+0x36e/0x450 [ 532.369268] [<ffffffff846a9b49>] ? ns_capable+0x29/0x50 [ 532.369366] [<ffffffff849afde2>] ? nla_parse+0x32/0x120 [ 532.369442] [<ffffffff84c81b4c>] tc_modify_qdisc+0x13c/0x610 [ 532.371508] [<ffffffff84c693e7>] rtnetlink_rcv_msg+0xa7/0x260 [ 532.372668] [<ffffffff84907b65>] ? sock_has_perm+0x75/0x90 [ 532.373790] [<ffffffff84c69340>] ? rtnl_newlink+0x890/0x890 [ 532.374914] [<ffffffff84c8da7b>] netlink_rcv_skb+0xab/0xc0 [ 532.376055] [<ffffffff84c63708>] rtnetlink_rcv+0x28/0x30 [ 532.377204] [<ffffffff84c8d400>] netlink_unicast+0x170/0x210 [ 532.378333] [<ffffffff84c8d7a8>] netlink_sendmsg+0x308/0x420 [ 532.379465] [<ffffffff84c2f3a6>] sock_sendmsg+0xb6/0xf0 [ 532.380710] [<ffffffffc034a56e>] ? __xfs_filemap_fault+0x8e/0x1d0 [xfs] [ 532.381868] [<ffffffffc034a75c>] ? xfs_filemap_fault+0x2c/0x30 [xfs] [ 532.383037] [<ffffffff847ec23a>] ? __do_fault.isra.61+0x8a/0x100 [ 532.384144] [<ffffffff84c30269>] ___sys_sendmsg+0x3e9/0x400 [ 532.385268] [<ffffffff847f3fad>] ? handle_mm_fault+0x39d/0x9b0 [ 532.386387] [<ffffffff84d88678>] ? __do_page_fault+0x238/0x500 [ 532.387472] [<ffffffff84c31921>] __sys_sendmsg+0x51/0x90 [ 532.388560] [<ffffffff84c31972>] SyS_sendmsg+0x12/0x20 [ 532.389636] [<ffffffff84d8dede>] system_call_fastpath+0x25/0x2a [ 532.390704] [<ffffffff84d8de21>] ? system_call_after_swapgs+0xae/0x146 [ 532.391753] Code: 00 00 00 00 00 00 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b b7 48 01 00 00 48 89 fb <48> 8b 8e a8 00 00 00 48 85 c9 74 43 48 89 ca eb 0f 0f 1f 80 00 [ 532.394036] RIP [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.395127] RSP <ffff8b7c9fdbf8e0> [ 532.396179] CR2: 00000000000000a8 Null pointer dereference happens on master->slaves dereference in teql_destroy() as master is null-pointer. When qdisc_create() calls teql_qdisc_init() it imediately fails after check "if (m->dev == dev)" because both devices are teql0, and it does not set qdisc_priv(sch)->m leaving it zero on error path, then qdisc_create() imediately calls teql_destroy() which does not expect zero master pointer and we get OOPS. Fixes: `87b60cfacf` ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:02 +02:00
Eli Cohen	422eda6255	vdpa/mlx5: Fix suspend/resume index restoration commit bc04d93ea30a0a8eb2a2648b848cef35d1f6f798 upstream. When we suspend the VM, the VDPA interface will be reset. When the VM is resumed again, clear_virtqueues() will clear the available and used indices resulting in hardware virqtqueue objects becoming out of sync. We can avoid this function alltogether since qemu will clear them if required, e.g. when the VM went through a reboot. Moreover, since the hw available and used indices should always be identical on query and should be restored to the same value same value for virtqueues that complete in order, we set the single value provided by set_vq_state(). In get_vq_state() we return the value of hardware used index. Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map") Fixes: `1a86b377aa` ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210408091047.4269-6-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Arkadiusz Kubalewski	89e406e952	i40e: Fix sparse errors in i40e_txrx.c commit 12738ac4754ec92a6a45bf3677d8da780a1412b3 upstream. Remove error handling through pointers. Instead use plain int to return value from i40e_run_xdp(...). Previously: - sparse errors were produced during compilation: i40e_txrx.c:2338 i40e_run_xdp() error: (-2147483647) too low for ERR_PTR i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing possible ERR_PTR() - sk_buff* was used to return value, but it has never had valid pointer to sk_buff. Returned value was always int handled as a pointer. Fixes: `0c8493d90b` ("i40e: add XDP support for pass and drop actions") Fixes: `2e68931238` ("i40e: split XDP_TX tail and XDP_REDIRECT map flushing") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Arkadiusz Kubalewski	12e1438a09	i40e: Fix sparse error: uninitialized symbol 'ring' commit d6d04ee6d2c9bb5084c8f6074195d6aa0024e825 upstream. Init pointer with NULL in default switch case statement. Previously the error was produced when compiling against sparse. i40e_debugfs.c:582 i40e_dbg_dump_desc() error: uninitialized symbol 'ring'. Fixes: `44ea803e2f` ("i40e: introduce new dump desc XDP command") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Arkadiusz Kubalewski	2472ba1c46	i40e: Fix sparse error: 'vsi->netdev' could be null commit 6b5674fe6b9bf05394886ebcec62b2d7dae88c42 upstream. Remove vsi->netdev->name from the trace. This is redundant information. With the devinfo trace, the adapter is already identifiable. Previously following error was produced when compiling against sparse. i40e_main.c:2571 i40e_sync_vsi_filters() error: we previously assumed 'vsi->netdev' could be null (see line 2323) Fixes: `b603f9dc20` ("i40e: Log info when PF is entering and leaving Allmulti mode.") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Arkadiusz Kubalewski	7923871182	i40e: Fix sparse warning: missing error code 'err' commit 8a1e918d833ca5c391c4ded5dc006e2d1ce6d37c upstream. Set proper return values inside error checking if-statements. Previously following warning was produced when compiling against sparse. i40e_main.c:15162 i40e_init_recovery_mode() warn: missing error code 'err' Fixes: `4ff0ee1af0` ("i40e: Introduce recovery mode support") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Eric Dumazet	f0b4c9acf5	net: ensure mac header is set in virtio_net_hdr_to_skb() commit 61431a5907fc36d0738e9a547c7e1556349a03e9 upstream. Commit 924a9bc362a5 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct") added a call to dev_parse_header_protocol() but mac_header is not yet set. This means that eth_hdr() reads complete garbage, and syzbot complained about it [1] This patch resets mac_header earlier, to get more coverage about this change. Audit of virtio_net_hdr_to_skb() callers shows that this change should be safe. [1] BUG: KASAN: use-after-free in eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282 Read of size 2 at addr ffff888017a6200b by task syz-executor313/8409 CPU: 1 PID: 8409 Comm: syz-executor313 Not tainted 5.12.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282 dev_parse_header_protocol include/linux/netdevice.h:3177 [inline] virtio_net_hdr_to_skb.constprop.0+0x99d/0xcd0 include/linux/virtio_net.h:83 packet_snd net/packet/af_packet.c:2994 [inline] packet_sendmsg+0x2325/0x52b0 net/packet/af_packet.c:3031 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 sock_no_sendpage+0xf3/0x130 net/core/sock.c:2860 kernel_sendpage.part.0+0x1ab/0x350 net/socket.c:3631 kernel_sendpage net/socket.c:3628 [inline] sock_sendpage+0xe5/0x140 net/socket.c:947 pipe_to_sendpage+0x2ad/0x380 fs/splice.c:364 splice_from_pipe_feed fs/splice.c:418 [inline] __splice_from_pipe+0x43e/0x8a0 fs/splice.c:562 splice_from_pipe fs/splice.c:597 [inline] generic_splice_sendpage+0xd4/0x140 fs/splice.c:746 do_splice_from fs/splice.c:767 [inline] do_splice+0xb7e/0x1940 fs/splice.c:1079 __do_splice+0x134/0x250 fs/splice.c:1144 __do_sys_splice fs/splice.c:1350 [inline] __se_sys_splice fs/splice.c:1332 [inline] __x64_sys_splice+0x198/0x250 fs/splice.c:1332 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 Fixes: 924a9bc362a5 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Balazs Nemeth <bnemeth@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
John Fastabend	72c5de25ba	bpf, sockmap: Fix incorrect fwd_alloc accounting commit 144748eb0c445091466c9b741ebd0bfcc5914f3d upstream. Incorrect accounting fwd_alloc can result in a warning when the socket is torn down, [18455.319240] WARNING: CPU: 0 PID: 24075 at net/core/stream.c:208 sk_stream_kill_queues+0x21f/0x230 [...] [18455.319543] Call Trace: [18455.319556] inet_csk_destroy_sock+0xba/0x1f0 [18455.319577] tcp_rcv_state_process+0x1b4e/0x2380 [18455.319593] ? lock_downgrade+0x3a0/0x3a0 [18455.319617] ? tcp_finish_connect+0x1e0/0x1e0 [18455.319631] ? sk_reset_timer+0x15/0x70 [18455.319646] ? tcp_schedule_loss_probe+0x1b2/0x240 [18455.319663] ? lock_release+0xb2/0x3f0 [18455.319676] ? __release_sock+0x8a/0x1b0 [18455.319690] ? lock_downgrade+0x3a0/0x3a0 [18455.319704] ? lock_release+0x3f0/0x3f0 [18455.319717] ? __tcp_close+0x2c6/0x790 [18455.319736] ? tcp_v4_do_rcv+0x168/0x370 [18455.319750] tcp_v4_do_rcv+0x168/0x370 [18455.319767] __release_sock+0xbc/0x1b0 [18455.319785] __tcp_close+0x2ee/0x790 [18455.319805] tcp_close+0x20/0x80 This currently happens because on redirect case we do skb_set_owner_r() with the original sock. This increments the fwd_alloc memory accounting on the original sock. Then on redirect we may push this into the queue of the psock we are redirecting to. When the skb is flushed from the queue we give the memory back to the original sock. The problem is if the original sock is destroyed/closed with skbs on another psocks queue then the original sock will not have a way to reclaim the memory before being destroyed. Then above warning will be thrown sockA sockB sk_psock_strp_read() sk_psock_verdict_apply() -- SK_REDIRECT -- sk_psock_skb_redirect() skb_queue_tail(psock_other->ingress_skb..) sk_close() sock_map_unref() sk_psock_put() sk_psock_drop() sk_psock_zap_ingress() At this point we have torn down our own psock, but have the outstanding skb in psock_other. Note that SK_PASS doesn't have this problem because the sk_psock_drop() logic releases the skb, its still associated with our psock. To resolve lets only account for sockets on the ingress queue that are still associated with the current socket. On the redirect case we will check memory limits per `6fa9201a89`, but will omit fwd_alloc accounting until skb is actually enqueued. When the skb is sent via skb_send_sock_locked or received with sk_psock_skb_ingress memory will be claimed on psock_other. Fixes: `6fa9201a89` ("bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/161731444013.68884.4021114312848535993.stgit@john-XPS-13-9370 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
John Fastabend	00c01de1a9	bpf, sockmap: Fix sk->prot unhash op reset commit 1c84b33101c82683dee8b06761ca1f69e78c8ee7 upstream. In '4da6a196f93b1' we fixed a potential unhash loop caused when a TLS socket in a sockmap was removed from the sockmap. This happened because the unhash operation on the TLS ctx continued to point at the sockmap implementation of unhash even though the psock has already been removed. The sockmap unhash handler when a psock is removed does the following, void sock_map_unhash(struct sock sk) { void (saved_unhash)(struct sock sk); struct sk_psock psock; rcu_read_lock(); psock = sk_psock(sk); if (unlikely(!psock)) { rcu_read_unlock(); if (sk->sk_prot->unhash) sk->sk_prot->unhash(sk); return; } [...] } The unlikely() case is there to handle the case where psock is detached but the proto ops have not been updated yet. But, in the above case with TLS and removed psock we never fixed sk_prot->unhash() and unhash() points back to sock_map_unhash resulting in a loop. To fix this we added this bit of code, static inline void sk_psock_restore_proto(struct sock sk, struct sk_psock psock) { sk->sk_prot->unhash = psock->saved_unhash; This will set the sk_prot->unhash back to its saved value. This is the correct callback for a TLS socket that has been removed from the sock_map. Unfortunately, this also overwrites the unhash pointer for all psocks. We effectively break sockmap unhash handling for any future socks. Omitting the unhash operation will leave stale entries in the map if a socket transition through unhash, but does not do close() op. To fix set unhash correctly before calling into tls_update. This way the TLS enabled socket will point to the saved unhash() handler. Fixes: `4da6a196f9` ("bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Reported-by: Lorenz Bauer <lmb@cloudflare.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/161731441904.68884.15593917809745631972.stgit@john-XPS-13-9370 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Dave Marchevsky	d921baabd9	bpf: Refcount task stack in bpf_get_task_stack commit 06ab134ce8ecfa5a69e850f88f81c8a4c3fa91df upstream. On x86 the struct pt_regs * grabbed by task_pt_regs() points to an offset of task->stack. The pt_regs are later dereferenced in __bpf_get_stack (e.g. by user_mode() check). This can cause a fault if the task in question exits while bpf_get_task_stack is executing, as warned by task_stack_page's comment: * When accessing the stack of a non-current task that might exit, use * try_get_task_stack() instead. task_stack_page will return a pointer * that could get freed out from under you. Taking the comment's advice and using try_get_task_stack() and put_task_stack() to hold task->stack refcount, or bail early if it's already 0. Incrementing stack_refcount will ensure the task's stack sticks around while we're using its data. I noticed this bug while testing a bpf task iter similar to bpf_iter_task_stack in selftests, except mine grabbed user stack, and getting intermittent crashes, which resulted in dumps like: BUG: unable to handle page fault for address: 0000000000003fe0 \#PF: supervisor read access in kernel mode \#PF: error_code(0x0000) - not-present page RIP: 0010:__bpf_get_stack+0xd0/0x230 <snip...> Call Trace: bpf_prog_0a2be35c092cb190_get_task_stacks+0x5d/0x3ec bpf_iter_run_prog+0x24/0x81 __task_seq_show+0x58/0x80 bpf_seq_read+0xf7/0x3d0 vfs_read+0x91/0x140 ksys_read+0x59/0xd0 do_syscall_64+0x48/0x120 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `fa28dcb82a` ("bpf: Introduce helper bpf_get_task_stack()") Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210401000747.3648767-1-davemarchevsky@fb.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Ciara Loftus	caef780614	libbpf: Only create rx and tx XDP rings when necessary commit ca7a83e2487ad0bc9a3e0e7a8645354aa1782f13 upstream. Prior to this commit xsk_socket__create(_shared) always attempted to create the rx and tx rings for the socket. However this causes an issue when the socket being setup is that which shares the fd with the UMEM. If a previous call to this function failed with this socket after the rings were set up, a subsequent call would always fail because the rings are not torn down after the first call and when we try to set them up again we encounter an error because they already exist. Solve this by remembering whether the rings were set up by introducing new bools to struct xsk_umem which represent the ring setup status and using them to determine whether or not to set up the rings. Fixes: `1cad078842` ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Ciara Loftus	4cc9177b09	libbpf: Restore umem state after socket create failure commit 43f1bc1efff16f553dd573d02eb7a15750925568 upstream. If the call to xsk_socket__create fails, the user may want to retry the socket creation using the same umem. Ensure that the umem is in the same state on exit if the call fails by: 1. ensuring the umem _save pointers are unmodified. 2. not unmapping the set of umem rings that were set up with the umem during xsk_umem__create, since those maps existed before the call to xsk_socket__create and should remain in tact even in the event of failure. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:01 +02:00
Ciara Loftus	5aa7df1722	libbpf: Ensure umem pointer is non-NULL before dereferencing commit df662016310aa4475d7986fd726af45c8fe4f362 upstream. Calls to xsk_socket__create dereference the umem to access the fill_save and comp_save pointers. Make sure the umem is non-NULL before doing this. Fixes: `2f6324a393` ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Lv Yunlong	b52e88638f	ethernet/netronome/nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx commit 6e5a03bcba44e080a6bf300194a68ce9bb1e5184 upstream. In nfp_bpf_ctrl_msg_rx, if nfp_ccm_get_type(skb) == NFP_CCM_TYPE_BPF_BPF_EVENT is true, the skb will be freed. But the skb is still used by nfp_ccm_rx(&bpf->ccm, skb). My patch adds a return when the skb was freed. Fixes: `bcf0cafab4` ("nfp: split out common control message handling code") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Lorenz Bauer	d86046a775	bpf: link: Refuse non-O_RDWR flags in BPF_OBJ_GET commit 25fc94b2f02d832fa8e29419699dcc20b0b05c6a upstream. Invoking BPF_OBJ_GET on a pinned bpf_link checks the path access permissions based on file_flags, but the returned fd ignores flags. This means that any user can acquire a "read-write" fd for a pinned link with mode 0664 by invoking BPF_OBJ_GET with BPF_F_RDONLY in file_flags. The fd can be used to invoke BPF_LINK_DETACH, etc. Fix this by refusing non-O_RDWR flags in BPF_OBJ_GET. This works because OBJ_GET by default returns a read write mapping and libbpf doesn't expose a way to override this behaviour for programs and links. Fixes: `70ed506c3b` ("bpf: Introduce pinnable bpf_link abstraction") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210326160501.46234-1-lmb@cloudflare.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Toke Høiland-Jørgensen	b7004ecafa	bpf: Enforce that struct_ops programs be GPL-only commit 12aa8a9467b354ef893ce0fc5719a4de4949a9fb upstream. With the introduction of the struct_ops program type, it became possible to implement kernel functionality in BPF, making it viable to use BPF in place of a regular kernel module for these particular operations. Thus far, the only user of this mechanism is for implementing TCP congestion control algorithms. These are clearly marked as GPL-only when implemented as modules (as seen by the use of EXPORT_SYMBOL_GPL for tcp_register_congestion_control()), so it seems like an oversight that this was not carried over to BPF implementations. Since this is the only user of the struct_ops mechanism, just enforcing GPL-only for the struct_ops program type seems like the simplest way to fix this. Fixes: `0baf26b0fc` ("bpf: tcp: Support tcp_congestion_ops in bpf") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210326100314.121853-1-toke@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Pedro Tammela	3015db3de7	libbpf: Fix bail out from 'ringbuf_process_ring()' on error commit 6032ebb54c60cae24329f6aba3ce0c1ca8ad6abe upstream. The current code bails out with negative and positive returns. If the callback returns a positive return code, 'ring_buffer__consume()' and 'ring_buffer__poll()' will return a spurious number of records consumed, but mostly important will continue the processing loop. This patch makes positive returns from the callback a no-op. Fixes: `bf99c936f9` ("libbpf: Add BPF ring buffer support") Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325150115.138750-1-pctammela@mojatatu.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Anirudh Rayabharam	dc195928d7	net: hso: fix null-ptr-deref during tty device unregistration commit 8a12f8836145ffe37e9c8733dce18c22fb668b66 upstream. Multiple ttys try to claim the same the minor number causing a double unregistration of the same device. The first unregistration succeeds but the next one results in a null-ptr-deref. The get_free_serial_index() function returns an available minor number but doesn't assign it immediately. The assignment is done by the caller later. But before this assignment, calls to get_free_serial_index() would return the same minor number. Fix this by modifying get_free_serial_index to assign the minor number immediately after one is found to be and rename it to obtain_minor() to better reflect what it does. Similary, rename set_serial_by_index() to release_minor() and modify it to free up the minor number of the given hso_serial. Every obtain_minor() should have corresponding release_minor() call. Fixes: `72dc1c096c` ("HSO: add option hso driver") Reported-by: syzbot+c49fe6089f295a05e6f8@syzkaller.appspotmail.com Tested-by: syzbot+c49fe6089f295a05e6f8@syzkaller.appspotmail.com Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Yongxin Liu	c2743e0a63	ice: fix memory leak of aRFS after resuming from suspend commit 1831da7ea5bdf5531d78bcf81f526faa4c4375fa upstream. In ice_suspend(), ice_clear_interrupt_scheme() is called, and then irq_free_descs() will be eventually called to free irq and its descriptor. In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap maybe cannot be freed, if the irqs that released in ice_suspend() were reassigned to other devices, which makes irq descriptor's affinity_notify lost. So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which can make sure all irq_glue and cpu_rmap can be correctly released before corresponding irq and descriptor are released. Fix the following memory leak. unreferenced object 0xffff95bd951afc00 (size 512): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p....... 00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................ backtrace: [<0000000072e4b914>] __kmalloc+0x336/0x540 [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0 [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 unreferenced object 0xffff95bd81b0a2a0 (size 96): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8............... b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................ backtrace: [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0 [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0 [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 Fixes: `769c500dcc` ("ice: Add advanced power mgmt for WoL") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Johannes Berg	6bd4e82292	iwlwifi: pcie: properly set LTR workarounds on 22000 devices commit 25628bc08d4526d3673ca7d039eb636aa9006076 upstream. As the context info gen3 code is only called for >=AX210 devices (from iwl_trans_pcie_gen2_start_fw()) the code there to set LTR on 22000 devices cannot actually do anything (22000 < AX210). Fix this by moving the LTR code to iwl_trans_pcie_gen2_start_fw() where it can handle both devices. This then requires that we kick the firmware only after that rather than doing it from the context info code. Note that this again had a dead branch in gen3 code, which I've removed here. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Fixes: ed0022da8bd9 ("iwlwifi: pcie: set LTR on more devices") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20210326125611.675486178ed1.Ib61463aba6920645059e366dcdca4c4c77f0ff58@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Robert Malz	e5386e87f8	ice: Cleanup fltr list in case of allocation issues commit b7eeb52721fe417730fc5adc5cbeeb5fe349ab26 upstream. When ice_remove_vsi_lkup_fltr is called, by calling ice_add_to_vsi_fltr_list local copy of vsi filter list is created. If any issues during creation of vsi filter list occurs it up for the caller to free already allocated memory. This patch ensures proper memory deallocation in these cases. Fixes: `80d144c9ac` ("ice: Refactor switch rule management structures and functions") Signed-off-by: Robert Malz <robertx.malz@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Anirudh Venkataramanan	9d1c342c50	ice: Use port number instead of PF ID for WoL commit 3176551979b92b02756979c0f1e2d03d1fc82b1e upstream. As per the spec, the WoL control word read from the NVM should be interpreted as port numbers, and not PF numbers. So when checking if WoL supported, use the port number instead of the PF ID. Also, ice_is_wol_supported doesn't really need a pointer to the pf struct, but just needs a pointer to the hw instance. Fixes: `769c500dcc` ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:42:00 +02:00
Jacek Bułatek	b696861102	ice: Fix for dereference of NULL pointer commit 7a91d3f02b04b2fb18c2dfa8b6c4e5a40a2753f5 upstream. Add handling of allocation fault for ice_vsi_list_map_info. Also *fi should not be NULL pointer, it is a reference to raw data field, so remove this variable and use the reference directly. Fixes: `9daf8208dd` ("ice: Add support for switch filter programming") Signed-off-by: Jacek Bułatek <jacekx.bulatek@intel.com> Co-developed-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Dave Ertman	4d73a6143d	ice: remove DCBNL_DEVRESET bit from PF state commit 741b7b743bbcb5a3848e4e55982064214f900d2f upstream. The original purpose of the ICE_DCBNL_DEVRESET was to protect the driver during DCBNL device resets. But, the flow for DCBNL device resets now consists of only calls up the stack such as dev_close() and dev_open() that will result in NDO calls to the driver. These will be handled with state changes from the stack. Also, there is a problem of the dev_close and dev_open being blocked by checks for reset in progress also using the ICE_DCBNL_DEVRESET bit. Since the ICE_DCBNL_DEVRESET bit is not necessary for protecting the driver from DCBNL device resets and it is actually blocking changes coming from the DCBNL interface, remove the bit from the PF state and don't block driver function based on DCBNL reset in progress. Fixes: `b94b013eb6` ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Bruce Allan	286830a846	ice: fix memory allocation call commit 59df14f9cc2326bd6432d60eca0df8201d9d3d4b upstream. Fix the order of number of array members and member size parameters in a *calloc() call. Fixes: `b3c3890489` ("ice: avoid unnecessary single-member variable-length structs") Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Krzysztof Goreczny	4686a26e95	ice: prevent ice_open and ice_stop during reset commit e95fc8573e07c5e4825df4650fd8b8c93fad27a7 upstream. There is a possibility of race between ice_open or ice_stop calls performed by OS and reset handling routine both trying to modify VSI resources. Observed scenarios: - reset handler deallocates memory in ice_vsi_free_arrays and ice_open tries to access it in ice_vsi_cfg_txq leading to driver crash - reset handler deallocates memory in ice_vsi_free_arrays and ice_close tries to access it in ice_down leading to driver crash - reset handler clears port scheduler topology and sets port state to ICE_SCHED_PORT_STATE_INIT leading to ice_ena_vsi_txq fail in ice_open To prevent this additional checks in ice_open and ice_stop are introduced to make sure that OS is not allowed to alter VSI config while reset is in progress. Fixes: `cdedef59de` ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Fabio Pricoco	ef7ed8c77d	ice: Increase control queue timeout commit f88c529ac77b3c21819d2cf1dfcfae1937849743 upstream. 250 msec timeout is insufficient for some AQ commands. Advice from FW team was to increase the timeout. Increase to 1 second. Fixes: `7ec59eeac8` ("ice: Add support for control queues") Signed-off-by: Fabio Pricoco <fabio.pricoco@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Anirudh Venkataramanan	6590b7bfbc	ice: Continue probe on link/PHY errors commit 08771bce330036d473be6ce851cd00bcd351ebf6 upstream. An incorrect NVM update procedure can result in the driver failing probe. In this case, the recommended resolution method is to update the NVM using the right procedure. However, if the driver fails probe, the user will not be able to update the NVM. So do not fail probe on link/PHY errors. Fixes: `1a3571b593` ("ice: restore PHY settings on media insertion") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Tetsuo Handa	9a7bc0c403	batman-adv: initialize "struct batadv_tvlv_tt_vlan_data"->reserved field commit 08c27f3322fec11950b8f1384aa0f3b11d028528 upstream. KMSAN found uninitialized value at batadv_tt_prepare_tvlv_local_data() [1], for commit `ced72933a5` ("batman-adv: use CRC32C instead of CRC16 in TT code") inserted 'reserved' field into "struct batadv_tvlv_tt_data" and commit `7ea7b4a142` ("batman-adv: make the TT CRC logic VLAN specific") moved that field to "struct batadv_tvlv_tt_vlan_data" but left that field uninitialized. [1] https://syzkaller.appspot.com/bug?id=07f3e6dba96f0eb3cabab986adcd8a58b9bdbe9d Reported-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com> Tested-by: syzbot <syzbot+50ee810676e6a089487b@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: `ced72933a5` ("batman-adv: use CRC32C instead of CRC16 in TT code") Fixes: `7ea7b4a142` ("batman-adv: make the TT CRC logic VLAN specific") Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Marek Behún	d1173effc5	ARM: dts: turris-omnia: configure LED[2]/INTn pin as interrupt pin commit a26c56ae67fa9fbb45a8a232dcd7ebaa7af16086 upstream. Use the `marvell,reg-init` DT property to configure the LED[2]/INTn pin of the Marvell 88E1514 ethernet PHY on Turris Omnia into interrupt mode. Without this the pin is by default in LED[2] mode, and the Marvell PHY driver configures LED[2] into "On - Link, Blink - Activity" mode. This fixes the issue where the pca9538 GPIO/interrupt controller (which can't mask interrupts in HW) received too many interrupts and after a time started ignoring the interrupt with error message: IRQ 71: nobody cared There is a work in progress to have the Marvell PHY driver support parsing PHY LED nodes from OF and registering the LEDs as Linux LED class devices. Once this is done the PHY driver can also automatically set the pin into INTn mode if it does not find LED[2] in OF. Until then, though, we fix this via `marvell,reg-init` DT property. Signed-off-by: Marek Behún <kabel@kernel.org> Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Fixes: `26ca8b52d6` ("ARM: dts: add support for Turris Omnia") Cc: Uwe Kleine-König <uwe@kleine-koenig.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Andrew Lunn <andrew@lunn.ch> Cc: Gregory CLEMENT <gregory.clement@bootlin.com> Cc: <stable@vger.kernel.org> Tested-by: Rui Salvaterra <rsalvaterra@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Gao Xiang	4941889535	parisc: avoid a warning on u8 cast for cmpxchg on u8 pointers commit 4d752e5af63753ab5140fc282929b98eaa4bd12e upstream. commit `b344d6a83d` ("parisc: add support for cmpxchg on u8 pointers") can generate a sparse warning ("cast truncates bits from constant value"), which has been reported several times [1] [2] [3]. The original code worked as expected, but anyway, let silence such sparse warning as what others did [4]. [1] https://lore.kernel.org/r/202104061220.nRMBwCXw-lkp@intel.com [2] https://lore.kernel.org/r/202012291914.T5Agcn99-lkp@intel.com [3] https://lore.kernel.org/r/202008210829.KVwn7Xeh%25lkp@intel.com [4] https://lore.kernel.org/r/20210315131512.133720-2-jacopo+renesas@jmondi.org Cc: Liam Beguin <liambeguin@gmail.com> Cc: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Helge Deller	597121792e	parisc: parisc-agp requires SBA IOMMU driver commit 9054284e8846b0105aad43a4e7174ca29fffbc44 upstream. Add a dependency to the SBA IOMMU driver to avoid: ERROR: modpost: "sba_list" [drivers/char/agp/parisc-agp.ko] undefined! Reported-by: kernel test robot <lkp@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:59 +02:00
Ilya Lipnitskiy	9b54dad28d	of: property: fw_devlink: do not link ".,nr-gpios" commit d473d32c2fbac2d1d7082c61899cfebd34eb267a upstream. [<vendor>,]nr-gpios property is used by some GPIO drivers[0] to indicate the number of GPIOs present on a system, not define a GPIO. nr-gpios is not configured by #gpio-cells and can't be parsed along with other "-gpios" properties. nr-gpios without the "<vendor>," prefix is not allowed by the DT spec[1], so only add exception for the ",nr-gpios" suffix and let the error message continue being printed for non-compliant implementations. [0] nr-gpios is referenced in Documentation/devicetree/bindings/gpio: - gpio-adnp.txt - gpio-xgene-sb.txt - gpio-xlp.txt - snps,dw-apb-gpio.yaml [1] Link: `cb53a16a1e/schemas/gpio/gpio-consumer.yaml (L20)` Fixes errors such as: OF: /palmbus@300000/gpio@600: could not find phandle Fixes: `7f00be96f1` ("of: property: Add device link support for interrupt-parent, dmas and -gpio(s)") Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> Cc: Saravana Kannan <saravanak@google.com> Cc: stable@vger.kernel.org # v5.5+ Link: https://lore.kernel.org/r/20210405222540.18145-1-ilya.lipnitskiy@gmail.com Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00
Wong Vee Khee	009c566527	ethtool: fix incorrect datatype in set_eee ops commit 63cf32389925e234d166fb1a336b46de7f846003 upstream. The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool header file. Hence, we should use ethnl_update_u32() in set_eee ops. Fixes: `fd77be7bd4` ("ethtool: set EEE settings with EEE_SET request") Cc: <stable@vger.kernel.org> # 5.10.x Cc: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00
Jack Qiu	3a675c1b50	fs: direct-io: fix missing sdio->boundary commit df41872b68601059dd4a84858952dcae58acd331 upstream. I encountered a hung task issue, but not a performance one. I run DIO on a device (need lba continuous, for example open channel ssd), maybe hungtask in below case: DIO: Checkpoint: get addr A(at boundary), merge into BIO, no submit because boundary missing flush dirty data(get addr A+1), wait IO(A+1) writeback timeout, because DIO(A) didn't submit get addr A+2 fail, because checkpoint is doing dio_send_cur_page() may clear sdio->boundary, so prevent it from missing a boundary. Link: https://lkml.kernel.org/r/20210322042253.38312-1-jack.qiu@huawei.com Fixes: `b1058b9812` ("direct-io: submit bio after boundary buffer is added to it") Signed-off-by: Jack Qiu <jack.qiu@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00
Wengang Wang	b1a5122554	ocfs2: fix deadlock between setattr and dio_end_io_write commit 90bd070aae6c4fb5d302f9c4b9c88be60c8197ec upstream. The following deadlock is detected: truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write). PID: 14827 TASK: ffff881686a9af80 CPU: 20 COMMAND: "ora_p005_hrltd9" #0 __schedule at ffffffff818667cc #1 schedule at ffffffff81866de6 #2 inode_dio_wait at ffffffff812a2d04 #3 ocfs2_setattr at ffffffffc05f322e [ocfs2] #4 notify_change at ffffffff812a5a09 #5 do_truncate at ffffffff812808f5 #6 do_sys_ftruncate.constprop.18 at ffffffff81280cf2 #7 sys_ftruncate at ffffffff81280d8e #8 do_syscall_64 at ffffffff81003949 #9 entry_SYSCALL_64_after_hwframe at ffffffff81a001ad dio completion path is going to complete one direct IO (decrement inode->i_dio_count), but before that it hung at locking inode->i_rwsem: #0 __schedule+700 at ffffffff818667cc #1 schedule+54 at ffffffff81866de6 #2 rwsem_down_write_failed+536 at ffffffff8186aa28 #3 call_rwsem_down_write_failed+23 at ffffffff8185a1b7 #4 down_write+45 at ffffffff81869c9d #5 ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2] #6 ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2] #7 dio_complete+140 at ffffffff812c873c #8 dio_aio_complete_work+25 at ffffffff812c89f9 #9 process_one_work+361 at ffffffff810b1889 #10 worker_thread+77 at ffffffff810b233d #11 kthread+261 at ffffffff810b7fd5 #12 ret_from_fork+62 at ffffffff81a0035e Thus above forms ABBA deadlock. The same deadlock was mentioned in upstream commit `28f5a8a7c0` ("ocfs2: should wait dio before inode lock in ocfs2_setattr()"). It seems that that commit only removed the cluster lock (the victim of above dead lock) from the ABBA deadlock party. End-user visible effects: Process hang in truncate -> ocfs2_setattr path and other processes hang at ocfs2_dio_end_io_write path. This is to fix the deadlock itself. It removes inode_lock() call from dio completion path to remove the deadlock and add ip_alloc_sem lock in setattr path to synchronize the inode modifications. [wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested] Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00
Mike Rapoport	4fabcf2294	nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff commit a3a8833dffb7e7329c2586b8bfc531adb503f123 upstream. Commit `cb9f753a37` ("mm: fix races between swapoff and flush dcache") updated flush_dcache_page implementations on several architectures to use page_mapping_file() in order to avoid races between page_mapping() and swapoff(). This update missed arch/nds32 and there is a possibility of a race there. Replace page_mapping() with page_mapping_file() in nds32 implementation of flush_dcache_page(). Link: https://lkml.kernel.org/r/20210330175126.26500-1-rppt@kernel.org Fixes: `cb9f753a37` ("mm: fix races between swapoff and flush dcache") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Greentime Hu <green.hu@gmail.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00
Sergei Trofimovich	7d9da660af	ia64: fix user_stack_pointer() for ptrace() commit 7ad1e366167837daeb93d0bacb57dee820b0b898 upstream. ia64 has two stacks: - memory stack (or stack), pointed at by by r12 - register backing store (register stack), pointed at by ar.bsp/ar.bspstore with complications around dirty register frame on CPU. In [1] Dmitry noticed that PTRACE_GET_SYSCALL_INFO returns the register stack instead memory stack. The bug comes from the fact that user_stack_pointer() and current_user_stack_pointer() don't return the same register: ulong user_stack_pointer(struct pt_regs *regs) { return regs->ar_bspstore; } #define current_user_stack_pointer() (current_pt_regs()->r12) The change gets both back in sync. I think ptrace(PTRACE_GET_SYSCALL_INFO) is the only affected user by this bug on ia64. The change fixes 'rt_sigreturn.gen.test' strace test where it was observed initially. Link: https://bugs.gentoo.org/769614 [1] Link: https://lkml.kernel.org/r/20210331084447.2561532-1-slyfox@gentoo.org Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Reported-by: Dmitry V. Levin <ldv@altlinux.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00
Nick Desaulniers	8e5bfafedf	gcov: re-fix clang-11+ support commit 9562fd132985ea9185388a112e50f2a51557827d upstream. LLVM changed the expected function signature for llvm_gcda_emit_function() in the clang-11 release. Users of clang-11 or newer may have noticed their kernels producing invalid coverage information: $ llvm-cov gcov -a -c -u -f -b <input>.gcda -- gcno=<input>.gcno 1 <func>: checksum mismatch, \ (<lineno chksum A>, <cfg chksum B>) != (<lineno chksum A>, <cfg chksum C>) 2 Invalid .gcda File! ... Fix up the function signatures so calling this function interprets its parameters correctly and computes the correct cfg checksum. In particular, in clang-11, the additional checksum is no longer optional. Link: https://reviews.llvm.org/rG25544ce2df0daa4304c07e64b9c8b0f7df60c11d Link: https://lkml.kernel.org/r/20210408184631.1156669-1-ndesaulniers@google.com Reported-by: Prasad Sodagudi <psodagud@quicinc.com> Tested-by: Prasad Sodagudi <psodagud@quicinc.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Cc: <stable@vger.kernel.org> [5.4+] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-04-14 08:41:58 +02:00
Al Viro	4390813936	LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late commit 4f0ed93fb92d3528c73c80317509df3f800a222b upstream. That (and traversals in case of umount .) should be done before complete_walk(). Either a braino or mismerge damage on queue reorders - either way, I should've spotted that much earlier. Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk> X-Paperbag: Brown Fixes: `161aff1d93` "LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()" Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00
Mike Marciniszyn	de427b662b	IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS commit 5de61a47eb9064cbbc5f3360d639e8e34a690a54 upstream. A panic can result when AIP is enabled: BUG: unable to handle kernel NULL pointer dereference at 000000000000000 PGD 0 P4D 0 Oops: 0000 1 SMP PTI CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1 Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014 RIP: 0010:__bitmap_and+0x1b/0x70 RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048 RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750 RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0 R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003 FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0 Call Trace: hfi1_num_netdev_contexts+0x7c/0x110 [hfi1] hfi1_init_dd+0xd7f/0x1a90 [hfi1] ? pci_bus_read_config_dword+0x49/0x70 ? pci_mmcfg_read+0x3e/0xe0 do_init_one.isra.18+0x336/0x640 [hfi1] local_pci_probe+0x41/0x90 pci_device_probe+0x105/0x1c0 really_probe+0x212/0x440 driver_probe_device+0x49/0xc0 device_driver_attach+0x50/0x60 __driver_attach+0x61/0x130 ? device_driver_attach+0x60/0x60 bus_for_each_dev+0x77/0xc0 ? klist_add_tail+0x3b/0x70 bus_add_driver+0x14d/0x1e0 ? dev_init+0x10b/0x10b [hfi1] driver_register+0x6b/0xb0 ? dev_init+0x10b/0x10b [hfi1] hfi1_mod_init+0x1e6/0x20a [hfi1] do_one_initcall+0x46/0x1c3 ? free_unref_page_commit+0x91/0x100 ? _cond_resched+0x15/0x30 ? kmem_cache_alloc_trace+0x140/0x1c0 do_init_module+0x5a/0x220 load_module+0x14b4/0x17e0 ? __do_sys_finit_module+0xa8/0x110 __do_sys_finit_module+0xa8/0x110 do_syscall_64+0x5b/0x1a0 The issue happens when pcibus_to_node() returns NO_NUMA_NODE. Fix this issue by moving the initialization of dd->node to hfi1_devdata allocation and remove the other pcibus_to_node() calls in the probe path and use dd->node instead. Affinity logic is adjusted to use a new field dd->affinity_entry as a guard instead of dd->node. Fixes: `4730f4a6c6` ("IB/hfi1: Activate the dummy netdev") Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com Cc: stable@vger.kernel.org Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:58 +02:00

1 2 3 4 5 ...

972564 Commits