linux_dsm_epyc7002/drivers/infiniband/sw
Mikhail Malygin efc365e729 IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
On ppc64le arch rxe_add command causes oops in kernel log:

[   92.495140] Oops: Kernel access of bad area, sig: 11 [#1]
[   92.499710] SMP NR_CPUS=2048 NUMA pSeries
[   92.499792] Modules linked in: ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) nf_conntrack_netlink(E) nfnetlink(E) xfrm_user(E) iptable
_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) xt_addrtype(E) iptable_filter(E) ip_tables(E) xt_conntrack(E) x_tables(E)
 nf_nat(E) nf_conntrack(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) af_packet(E) rpcrdma(E) ib_isert(E) iscsi_target_mod(E) i
b_iser(E) libiscsi(E) ib_srpt(E) target_core_mod(E) ib_srp(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) bochs_drm(E) tt
m(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) drm(E) agpgart(E) virtio_rng(E) virtio_console(E) rtc_
generic(E) dm_ec(OEN) ttln_rdma(OEN) rdma_cm(E) configfs(E) iw_cm(E) ib_cm(E) rdma_rxe(E) ip6_udp_tunnel(E) udp_tunnel(E) ib_core(E) ql
a2xxx(E)
[   92.499832]  scsi_transport_fc(E) nvme_fc(E) nvme_fabrics(E) nvme_core(E) ipmi_watchdog(E) ipmi_ssif(E) ipmi_poweroff(E) ipmi_powernv(EX) ipmi_devintf(E) ipmi_msghandler(E) dummy(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_service_time(E) scsi_transport_iscsi(E) sd_mod(E) sr_mod(E) cdrom(E) hid_generic(E) usbhid(E) virtio_blk(E) virtio_scsi(E) virtio_net(E) ibmvscsi(EX) scsi_transport_srp(E) xhci_pci(E) xhci_hcd(E) usbcore(E) usb_common(E) virtio_pci(E) virtio_ring(E) virtio(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) autofs4(E)
[   92.499834] Supported: No, Unsupported modules are loaded
[   92.499839] CPU: 3 PID: 5576 Comm: sh Tainted: G           OE   NX 4.4.120-ttln.17-default #1
[   92.499841] task: c0000000afe8a490 ti: c0000000beba8000 task.ti: c0000000beba8000
[   92.499842] NIP: c00000000008ba3c LR: c000000000027644 CTR: c00000000008ba10
[   92.499844] REGS: c0000000bebab750 TRAP: 0300   Tainted: G           OE   NX  (4.4.120-ttln.17-default)
[   92.499850] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28424428  XER: 20000000
[   92.499871] CFAR: 0000000000002424 DAR: 0000000000000208 DSISR: 40000000 SOFTE: 1
               GPR00: c000000000027644 c0000000bebab9d0 c000000000f09700 0000000000000000
               GPR04: d0000000043d7192 0000000000000002 000000000000001a fffffffffffffffe
               GPR08: 000000000000009c c00000000008ba10 d0000000043e5848 d0000000043d3828
               GPR12: c00000000008ba10 c000000007a02400 0000000010062e38 0000010020388860
               GPR16: 0000000000000000 0000000000000000 00000100203885f0 00000000100f6c98
               GPR20: c0000000b3f1fcc0 c0000000b3f1fc48 c0000000b3f1fbd0 c0000000b3f1fb58
               GPR24: c0000000b3f1fae0 c0000000b3f1fa68 00000000000005dc c0000000b3f1f9f0
               GPR28: d0000000043e5848 c0000000b3f1f900 c0000000b3f1f320 c0000000b3f1f000
[   92.499881] NIP [c00000000008ba3c] dma_get_required_mask_pSeriesLP+0x2c/0x1a0
[   92.499885] LR [c000000000027644] dma_get_required_mask+0x44/0xac
[   92.499886] Call Trace:
[   92.499891] [c0000000bebab9d0] [c0000000bebaba30] 0xc0000000bebaba30 (unreliable)
[   92.499894] [c0000000bebaba10] [c000000000027644] dma_get_required_mask+0x44/0xac
[   92.499904] [c0000000bebaba30] [d0000000043cb4b4] rxe_register_device+0xc4/0x430 [rdma_rxe]
[   92.499910] [c0000000bebabab0] [d0000000043c06c8] rxe_add+0x448/0x4e0 [rdma_rxe]
[   92.499915] [c0000000bebabb30] [d0000000043d28dc] rxe_net_add+0x4c/0xf0 [rdma_rxe]
[   92.499921] [c0000000bebabb60] [d0000000043d305c] rxe_param_set_add+0x6c/0x1ac [rdma_rxe]
[   92.499924] [c0000000bebabbf0] [c0000000000e78c0] param_attr_store+0xa0/0x180
[   92.499927] [c0000000bebabc70] [c0000000000e6448] module_attr_store+0x48/0x70
[   92.499932] [c0000000bebabc90] [c000000000391f60] sysfs_kf_write+0x70/0xb0
[   92.499935] [c0000000bebabcb0] [c000000000390f1c] kernfs_fop_write+0x18c/0x1e0
[   92.499939] [c0000000bebabd00] [c0000000002e22ac] __vfs_write+0x4c/0x1d0
[   92.499942] [c0000000bebabd90] [c0000000002e2f94] vfs_write+0xc4/0x200
[   92.499945] [c0000000bebabde0] [c0000000002e488c] SyS_write+0x6c/0x110
[   92.499948] [c0000000bebabe30] [c000000000009384] system_call+0x38/0xe4
[   92.499949] Instruction dump:
[   92.499954] 4e800020 3c4c00e8 3842dcf0 7c0802a6 f8010010 60000000 7c0802a6 fba1ffe8
[   92.499958] fbc1fff0 fbe1fff8 f8010010 f821ffc1 <e9230208> 7c7e1b78 2fa90000 419e0078
[   92.499962] ---[ end trace bed077e15eb420cf ]---

It fails in dma_get_required_mask, that has ppc-specific implementation,
and fail if provided device argument is NULL

Signed-off-by: Mikhail Malygin <mikhail@malygin.me>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-05 13:04:50 -06:00
..
rdmavt IB/uverbs: Extend uverbs_ioctl header with driver_id 2018-03-19 14:45:17 -06:00
rxe IB/rxe: Fix for oops in rxe_register_device on ppc64le arch 2018-04-05 13:04:50 -06:00
Makefile