This reverts commit 7ad813f208 ("net: phy:
Correctly process PHY_HALTED in phy_stop_machine()") because it is
creating the possibility for a NULL pointer dereference.
David Daney provide the following call trace and diagram of events:
When ndo_stop() is called we call:
phy_disconnect()
+---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;
+---> phy_stop_machine()
| +---> phy_state_machine()
| +----> queue_delayed_work(): Work queued.
+--->phy_detach() implies: phydev->attached_dev = NULL;
Now at a later time the queued work does:
phy_state_machine()
+---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:
CPU 12 Unable to handle kernel paging request at virtual address
0000000000000048, epc == ffffffff80de37ec, ra == ffffffff80c7c
Oops[#1]:
CPU: 12 PID: 1502 Comm: kworker/12:1 Not tainted 4.9.43-Cavium-Octeon+ #1
Workqueue: events_power_efficient phy_state_machine
task: 80000004021ed100 task.stack: 8000000409d70000
$ 0 : 0000000000000000 ffffffff84720060 0000000000000048 0000000000000004
$ 4 : 0000000000000000 0000000000000001 0000000000000004 0000000000000000
$ 8 : 0000000000000000 0000000000000000 00000000ffff98f3 0000000000000000
$12 : 8000000409d73fe0 0000000000009c00 ffffffff846547c8 000000000000af3b
$16 : 80000004096bab68 80000004096babd0 0000000000000000 80000004096ba800
$20 : 0000000000000000 0000000000000000 ffffffff81090000 0000000000000008
$24 : 0000000000000061 ffffffff808637b0
$28 : 8000000409d70000 8000000409d73cf0 80000000271bd300 ffffffff80c7804c
Hi : 000000000000002a
Lo : 000000000000003f
epc : ffffffff80de37ec netif_carrier_off+0xc/0x58
ra : ffffffff80c7804c phy_state_machine+0x48c/0x4f8
Status: 14009ce3 KX SX UX KERNEL EXL IE
Cause : 00800008 (ExcCode 02)
BadVA : 0000000000000048
PrId : 000d9501 (Cavium Octeon III)
Modules linked in:
Process kworker/12:1 (pid: 1502, threadinfo=8000000409d70000,
task=80000004021ed100, tls=0000000000000000)
Stack : 8000000409a54000 80000004096bab68 80000000271bd300 80000000271c1e00
0000000000000000 ffffffff808a1708 8000000409a54000 80000000271bd300
80000000271bd320 8000000409a54030 ffffffff80ff0f00 0000000000000001
ffffffff81090000 ffffffff808a1ac0 8000000402182080 ffffffff84650000
8000000402182080 ffffffff84650000 ffffffff80ff0000 8000000409a54000
ffffffff808a1970 0000000000000000 80000004099e8000 8000000402099240
0000000000000000 ffffffff808a8598 0000000000000000 8000000408eeeb00
8000000409a54000 00000000810a1d00 0000000000000000 8000000409d73de8
8000000409d73de8 0000000000000088 000000000c009c00 8000000409d73e08
8000000409d73e08 8000000402182080 ffffffff808a84d0 8000000402182080
...
Call Trace:
[<ffffffff80de37ec>] netif_carrier_off+0xc/0x58
[<ffffffff80c7804c>] phy_state_machine+0x48c/0x4f8
[<ffffffff808a1708>] process_one_work+0x158/0x368
[<ffffffff808a1ac0>] worker_thread+0x150/0x4c0
[<ffffffff808a8598>] kthread+0xc8/0xe0
[<ffffffff808617f0>] ret_from_kernel_thread+0x14/0x1c
The original motivation for this change originated from Marc Gonzales
indicating that his network driver did not have its adjust_link callback
executing with phydev->link = 0 while he was expecting it.
PHYLIB has never made any such guarantees ever because phy_stop() merely just
tells the workqueue to move into PHY_HALTED state which will happen
asynchronously.
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reported-by: David Daney <ddaney.cavm@gmail.com>
Fixes: 7ad813f208 ("net: phy: Correctly process PHY_HALTED in phy_stop_machine()")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>