linux_dsm_epyc7002/arch/sparc
Babu Moger d0c31e0200 sparc/PCI: Fix for panic while enabling SR-IOV
We noticed this panic while enabling SR-IOV in sparc.

mlx4_core: Mellanox ConnectX core driver v2.2-1 (Jan  1 2015)
mlx4_core: Initializing 0007:01:00.0
mlx4_core 0007:01:00.0: Enabling SR-IOV with 5 VFs
mlx4_core: Initializing 0007:01:00.1
Unable to handle kernel NULL pointer dereference
insmod(10010): Oops [#1]
CPU: 391 PID: 10010 Comm: insmod Not tainted
		4.1.12-32.el6uek.kdump2.sparc64 #1
TPC: <dma_supported+0x20/0x80>
I7: <__mlx4_init_one+0x324/0x500 [mlx4_core]>
Call Trace:
 [00000000104c5ea4] __mlx4_init_one+0x324/0x500 [mlx4_core]
 [00000000104c613c] mlx4_init_one+0xbc/0x120 [mlx4_core]
 [0000000000725f14] local_pci_probe+0x34/0xa0
 [0000000000726028] pci_call_probe+0xa8/0xe0
 [0000000000726310] pci_device_probe+0x50/0x80
 [000000000079f700] really_probe+0x140/0x420
 [000000000079fa24] driver_probe_device+0x44/0xa0
 [000000000079fb5c] __device_attach+0x3c/0x60
 [000000000079d85c] bus_for_each_drv+0x5c/0xa0
 [000000000079f588] device_attach+0x88/0xc0
 [000000000071acd0] pci_bus_add_device+0x30/0x80
 [0000000000736090] virtfn_add.clone.1+0x210/0x360
 [00000000007364a4] sriov_enable+0x2c4/0x520
 [000000000073672c] pci_enable_sriov+0x2c/0x40
 [00000000104c2d58] mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
 [00000000104c49ac] mlx4_load_one+0x42c/0xd40 [mlx4_core]
Disabling lock debugging due to kernel taint
Caller[00000000104c5ea4]: __mlx4_init_one+0x324/0x500 [mlx4_core]
Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
Caller[0000000000726310]: pci_device_probe+0x50/0x80
Caller[000000000079f700]: really_probe+0x140/0x420
Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
Caller[000000000079fb5c]: __device_attach+0x3c/0x60
Caller[000000000079d85c]: bus_for_each_drv+0x5c/0xa0
Caller[000000000079f588]: device_attach+0x88/0xc0
Caller[000000000071acd0]: pci_bus_add_device+0x30/0x80
Caller[0000000000736090]: virtfn_add.clone.1+0x210/0x360
Caller[00000000007364a4]: sriov_enable+0x2c4/0x520
Caller[000000000073672c]: pci_enable_sriov+0x2c/0x40
Caller[00000000104c2d58]: mlx4_enable_sriov+0xf8/0x180 [mlx4_core]
Caller[00000000104c49ac]: mlx4_load_one+0x42c/0xd40 [mlx4_core]
Caller[00000000104c5f90]: __mlx4_init_one+0x410/0x500 [mlx4_core]
Caller[00000000104c613c]: mlx4_init_one+0xbc/0x120 [mlx4_core]
Caller[0000000000725f14]: local_pci_probe+0x34/0xa0
Caller[0000000000726028]: pci_call_probe+0xa8/0xe0
Caller[0000000000726310]: pci_device_probe+0x50/0x80
Caller[000000000079f700]: really_probe+0x140/0x420
Caller[000000000079fa24]: driver_probe_device+0x44/0xa0
Caller[000000000079fb08]: __driver_attach+0x88/0xa0
Caller[000000000079d90c]: bus_for_each_dev+0x6c/0xa0
Caller[000000000079f29c]: driver_attach+0x1c/0x40
Caller[000000000079e35c]: bus_add_driver+0x17c/0x220
Caller[00000000007a02d4]: driver_register+0x74/0x120
Caller[00000000007263fc]: __pci_register_driver+0x3c/0x60
Caller[00000000104f62bc]: mlx4_init+0x60/0xcc [mlx4_core]
Kernel panic - not syncing: Fatal exception
Press Stop-A (L1-A) to return to the boot prom
---[ end Kernel panic - not syncing: Fatal exception

Details:
Here is the call sequence
virtfn_add->__mlx4_init_one->dma_set_mask->dma_supported

The panic happened at line 760(file arch/sparc/kernel/iommu.c)

758 int dma_supported(struct device *dev, u64 device_mask)
759 {
760         struct iommu *iommu = dev->archdata.iommu;
761         u64 dma_addr_mask = iommu->dma_addr_mask;
762
763         if (device_mask >= (1UL << 32UL))
764                 return 0;
765
766         if ((device_mask & dma_addr_mask) == dma_addr_mask)
767                 return 1;
768
769 #ifdef CONFIG_PCI
770         if (dev_is_pci(dev))
771		return pci64_dma_supported(to_pci_dev(dev), device_mask);
772 #endif
773
774         return 0;
775 }
776 EXPORT_SYMBOL(dma_supported);

Same panic happened with Intel ixgbe driver also.

SR-IOV code looks for arch specific data while enabling
VFs. When VF device is added, driver probe function makes set
of calls to initialize the pci device. Because the VF device is
added different way than the normal PF device(which happens via
of_create_pci_dev for sparc), some of the arch specific initialization
does not happen for VF device.  That causes panic when archdata is
accessed.

To fix this, I have used already defined weak function
pcibios_setup_device to copy archdata from PF to VF.
Also verified the fix.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-29 20:50:37 -04:00
..
boot sparc: Add "install" target 2014-08-04 20:45:59 -07:00
configs ldmvsw: Add ldmvsw.c driver code 2016-03-18 19:33:00 -04:00
crypto crypto: sparc - initialize blkcipher.ivsize 2015-10-08 21:36:48 +08:00
include Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2016-03-28 15:14:11 -05:00
kernel sparc/PCI: Fix for panic while enabling SR-IOV 2016-03-29 20:50:37 -04:00
lib sparc64: fix FP corruption in user copy functions 2015-12-24 12:13:18 -05:00
math-emu arch/sparc/math-emu/math_32.c: drop stray break operator 2014-08-04 20:29:06 -07:00
mm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2016-03-28 15:14:11 -05:00
net sparc: Fix misspellings in comments. 2016-03-20 21:28:58 -07:00
oprofile sparc: using HZ needs an include of linux/param.h 2009-10-05 00:46:08 -07:00
power Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2014-10-11 20:36:34 -04:00
prom sparc64: Fix register corruption in top-most kernel stack frame during boot. 2014-10-24 09:52:49 -07:00
Kbuild sparc64: Add SHA1 driver making use of the 'sha1' instruction. 2012-08-20 15:08:49 -07:00
Kconfig dma-mapping: always provide the dma_map_ops based implementation 2016-01-20 17:09:18 -08:00
Kconfig.debug lib: consolidate DEBUG_STACK_USAGE option 2011-05-25 08:39:54 -07:00
Makefile sparc32: Add -Wa,-Av8 to KBUILD_CFLAGS. 2016-03-01 00:24:04 -05:00