linux_dsm_epyc7002/drivers/net/ethernet/intel/fm10k
Jacob Keller e72319bba8 fm10k: don't initialize service task until later in probe
Delay initialization of the service timer and service task until late
probe. If we don't wait, failures in probe do not properly cleanup the
service timer or service task items, which results in the kernel panic
below, potentially freezing the whole system. In addition, ensure that
the SERVICE_DISABLE bit is set before we request the MBX IRQ since the
MBX interrupt attempts to schedule the service task otherwise. This
prevents a similar trace from occurring after this change.

We didn't notice this issue before because probe almost always completes
successfully. I discovered it due to a mis-ordered mailbox handler
array, which resulted in the following failure when requesting mailbox
interrupt.

[  555.325619] ------------[ cut here ]------------
[  555.325628] WARNING: CPU: 0 PID: 4941 at lib/list_debug.c:33 __list_add+0xa0/0xd0()
[  555.325631] list_add corruption. prev->next should be next (ffffffff81f46648), but was           (null). (prev=ffff8807fad5d0e8).
<snip>
[  555.325722] CPU: 0 PID: 4941 Comm: insmod Tainted: G           OE   4.0.4-303.fc22.x86_64 #1
[  555.325725] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.8x23.060520140825 06/05/2014
[  555.325727]  0000000000000000 00000000b4f161b3 ffff88081a21f8e8 ffffffff81783124
[  555.325734]  0000000000000000 ffff88081a21f940 ffff88081a21f928 ffffffff8109c66a
[  555.325740]  0000000064000000 ffff8807fad5d0e8 ffff8807fad5d0e8 ffffffff81f46648
[  555.325746] Call Trace:
[  555.325752]  [<ffffffff81783124>] dump_stack+0x45/0x57
[  555.325757]  [<ffffffff8109c66a>] warn_slowpath_common+0x8a/0xc0
[  555.325759]  [<ffffffff8109c6f5>] warn_slowpath_fmt+0x55/0x70
[  555.325763]  [<ffffffff813ba270>] __list_add+0xa0/0xd0
[  555.325768]  [<ffffffff81102d1d>] __internal_add_timer+0x9d/0x110
[  555.325771]  [<ffffffff81102dbf>] internal_add_timer+0x2f/0xc0
[  555.325774]  [<ffffffff81104e5a>] mod_timer+0x12a/0x230
[  555.325782]  [<ffffffffa03d54ca>] fm10k_probe+0x69a/0xc80 [fm10k]
[  555.325787]  [<ffffffff813e8355>] local_pci_probe+0x45/0xa0
[  555.325791]  [<ffffffff8129cf42>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0
[  555.325794]  [<ffffffff813e96b9>] pci_device_probe+0xf9/0x150
[  555.325799]  [<ffffffff814d7e73>] driver_probe_device+0xa3/0x400
[  555.325802]  [<ffffffff814d82ab>] __driver_attach+0x9b/0xa0
[  555.325805]  [<ffffffff814d8210>] ? __device_attach+0x40/0x40
[  555.325808]  [<ffffffff814d5bd3>] bus_for_each_dev+0x73/0xc0
[  555.325811]  [<ffffffff814d78ce>] driver_attach+0x1e/0x20
[  555.325815]  [<ffffffff814d7480>] bus_add_driver+0x180/0x250
[  555.325819]  [<ffffffffa03b2000>] ? 0xffffffffa03b2000
[  555.325823]  [<ffffffff814d8aa4>] driver_register+0x64/0xf0
[  555.325826]  [<ffffffff813e7bec>] __pci_register_driver+0x4c/0x50
[  555.325832]  [<ffffffffa03d6ca3>] fm10k_register_pci_driver+0x23/0x30 [fm10k]
[  555.325838]  [<ffffffffa03b2080>] fm10k_init_module+0x80/0x1000 [fm10k]
[  555.325843]  [<ffffffff81002128>] do_one_initcall+0xb8/0x200
[  555.325848]  [<ffffffff811e10d2>] ? __vunmap+0xa2/0x100
[  555.325852]  [<ffffffff811fe239>] ? kmem_cache_alloc_trace+0x1b9/0x240
[  555.325855]  [<ffffffff8178230e>] ? do_init_module+0x28/0x1cb
[  555.325858]  [<ffffffff81782346>] do_init_module+0x60/0x1cb
[  555.325862]  [<ffffffff8112168e>] load_module+0x205e/0x26b0
[  555.325866]  [<ffffffff8111d110>] ? store_uevent+0x70/0x70
[  555.325870]  [<ffffffff812234b0>] ? kernel_read+0x50/0x80
[  555.325873]  [<ffffffff81121f3e>] SyS_finit_module+0xbe/0xf0
[  555.325878]  [<ffffffff81789749>] system_call_fastpath+0x12/0x17
[  555.325880] ---[ end trace 9e0f58d071eafd2a ]---

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-04-05 12:49:02 -07:00
..
fm10k_common.c fm10k: Resolve various spelling errors and checkpatch warnings 2015-03-03 01:07:23 -08:00
fm10k_common.h fm10k: Add support for ptp to hw specific files 2014-09-23 03:59:22 -07:00
fm10k_dcbnl.c fm10k: conditionally compile DCB and DebugFS support 2015-12-13 15:27:59 -08:00
fm10k_debugfs.c fm10k: conditionally compile DCB and DebugFS support 2015-12-13 15:27:59 -08:00
fm10k_ethtool.c fm10k: cleanup remaining right-bit-shifted 1 2016-04-05 12:44:30 -07:00
fm10k_iov.c fm10k: add support for extra debug statistics 2015-09-22 15:58:27 -07:00
fm10k_main.c fm10k: cleanup remaining right-bit-shifted 1 2016-04-05 12:44:30 -07:00
fm10k_mbx.c fm10k: cleanup mailbox code comments etc 2015-12-22 05:19:57 -08:00
fm10k_mbx.h fm10k: cleanup mailbox code comments etc 2015-12-22 05:19:57 -08:00
fm10k_netdev.c fm10k: cleanup remaining right-bit-shifted 1 2016-04-05 12:44:30 -07:00
fm10k_pci.c fm10k: don't initialize service task until later in probe 2016-04-05 12:49:02 -07:00
fm10k_pf.c fm10k: cleanup remaining right-bit-shifted 1 2016-04-05 12:44:30 -07:00
fm10k_pf.h fm10k: constify fm10k_mac_ops, fm10k_iov_ops and fm10k_info structures 2015-12-22 05:10:01 -08:00
fm10k_ptp.c fm10k: cleanup SPACE_BEFORE_TAB checkpatch warning 2016-04-05 12:44:44 -07:00
fm10k_tlv.c fm10k: cleanup remaining right-bit-shifted 1 2016-04-05 12:44:30 -07:00
fm10k_tlv.h fm10k: cleanup mailbox code comments etc 2015-12-22 05:19:57 -08:00
fm10k_type.h fm10k: cleanup remaining right-bit-shifted 1 2016-04-05 12:44:30 -07:00
fm10k_vf.c fm10k: constify fm10k_mac_ops, fm10k_iov_ops and fm10k_info structures 2015-12-22 05:10:01 -08:00
fm10k_vf.h fm10k: constify fm10k_mac_ops, fm10k_iov_ops and fm10k_info structures 2015-12-22 05:10:01 -08:00
fm10k.h fm10k: cleanup remaining right-bit-shifted 1 2016-04-05 12:44:30 -07:00
Makefile fm10k: conditionally compile DCB and DebugFS support 2015-12-13 15:27:59 -08:00