Go to file
Wei Hu (Xavier) d061effc36 RDMA/hns: Fix the Oops during rmmod or insmod ko when reset occurs
In the reset process, the hns3 NIC driver notifies the RoCE driver to
perform reset related processing by calling the .reset_notify() interface
registered by the RoCE driver in hip08 SoC.

In the current version, if a reset occurs simultaneously during the
execution of rmmod or insmod ko, there may be Oops error as below:

 Internal error: Oops: 86000007 [#1] PREEMPT SMP
 Modules linked in: hns_roce(O) hns3(O) hclge(O) hnae3(O) [last unloaded: hns_roce_hw_v2]
 CPU: 0 PID: 14 Comm: kworker/0:1 Tainted: G           O      4.19.0-ge00d540 #1
 Hardware name: Huawei Technologies Co., Ltd.
 Workqueue: events hclge_reset_service_task [hclge]
 pstate: 60c00009 (nZCv daif +PAN +UAO)
 pc : 0xffff00000100b0b8
 lr : 0xffff00000100aea0
 sp : ffff000009afbab0
 x29: ffff000009afbab0 x28: 0000000000000800
 x27: 0000000000007ff0 x26: ffff80002f90c004
 x25: 00000000000007ff x24: ffff000008f97000
 x23: ffff80003efee0a8 x22: 0000000000001000
 x21: ffff80002f917ff0 x20: ffff8000286ea070
 x19: 0000000000000800 x18: 0000000000000400
 x17: 00000000c4d3225d x16: 00000000000021b8
 x15: 0000000000000400 x14: 0000000000000400
 x13: 0000000000000000 x12: ffff80003fac6e30
 x11: 0000800036303000 x10: 0000000000000001
 x9 : 0000000000000000 x8 : ffff80003016d000
 x7 : 0000000000000000 x6 : 000000000000003f
 x5 : 0000000000000040 x4 : 0000000000000000
 x3 : 0000000000000004 x2 : 00000000000007ff
 x1 : 0000000000000000 x0 : 0000000000000000
 Process kworker/0:1 (pid: 14, stack limit = 0x00000000af8f0ad9)
 Call trace:
  0xffff00000100b0b8
  0xffff00000100b3a0
  hns_roce_init+0x624/0xc88 [hns_roce]
  0xffff000001002df8
  0xffff000001006960
  hclge_notify_roce_client+0x74/0xe0 [hclge]
  hclge_reset_service_task+0xa58/0xbc0 [hclge]
  process_one_work+0x1e4/0x458
  worker_thread+0x40/0x450
  kthread+0x12c/0x130
  ret_from_fork+0x10/0x18
 Code: bad PC value

In the reset process, we will release the resources firstly, and after the
hardware reset is completed, we will reapply resources and reconfigure the
hardware.

We can solve this problem by modifying both the NIC and the RoCE
driver. We can modify the concurrent processing in the NIC driver to avoid
calling the .reset_notify and .uninit_instance ops at the same time. And
we need to modify the RoCE driver to record the reset stage and the
driver's init/uninit state, and check the state in the .reset_notify,
.init_instance. and uninit_instance functions to avoid NULL pointer
operation.

Fixes: cb7a94c9c8 ("RDMA/hns: Add reset process for RoCE in hip08")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:13:50 -07:00
arch Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-02-03 09:08:12 -08:00
block blk-mq: fix a hung issue when fsync 2019-01-30 08:53:54 -07:00
certs kbuild: remove redundant target cleaning on failure 2019-01-06 09:46:51 +09:00
crypto crypto: sm3 - fix undefined shift by >= width of value 2019-01-10 21:37:32 +08:00
Documentation Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-02-03 09:08:12 -08:00
drivers RDMA/hns: Fix the Oops during rmmod or insmod ko when reset occurs 2019-02-04 16:13:50 -07:00
firmware kbuild: change filechk to surround the given command with { } 2019-01-06 09:46:51 +09:00
fs for-5.0-rc4-tag 2019-02-03 08:48:33 -08:00
include RDMA/rxe: Improve loopback marking 2019-02-04 15:57:49 -07:00
init psi: clarify the Kconfig text for the default-disable option 2019-02-01 15:46:24 -08:00
ipc ipc: IPCMNI limit check for semmni 2018-10-31 08:54:14 -07:00
kernel Linux 5.0-rc5 2019-02-04 14:53:42 -07:00
lib lib/test_kmod.c: potential double free in error handling 2019-02-01 15:46:23 -08:00
LICENSES This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
mm mm: migrate: don't rely on __PageMovable() of newpage after unlocking it 2019-02-01 15:46:24 -08:00
net Linux 5.0-rc5 2019-02-04 14:53:42 -07:00
samples samples/bpf: workaround clang asm goto compilation errors 2019-01-15 20:57:30 +01:00
scripts Bug fixes for gcc-plugins 2019-01-21 13:07:03 +13:00
security apparmor: Fix aa_label_build() error handling for failed merges 2019-02-01 08:01:39 -08:00
sound sound fixes for 5.0-rc5 2019-01-31 10:00:00 -08:00
tools Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-02-03 08:59:51 -08:00
usr user/Makefile: Fix typo and capitalization in comment section 2018-12-11 00:18:03 +09:00
virt KVM: validate userspace input in kvm_clear_dirty_log_protect() 2019-01-11 18:38:07 +01:00
.clang-format clang-format: Update .clang-format with the latest for_each macro list 2019-01-19 19:26:06 +01:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore kbuild: Add support for DT binding schema checks 2018-12-13 09:41:32 -06:00
.mailmap A few early MIPS fixes for 4.21: 2019-01-05 12:48:25 -08:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS Add CREDITS entry for Shaohua Li 2019-01-04 14:27:09 -07:00
Kbuild kbuild: use assignment instead of define ... endef for filechk_* rules 2019-01-06 10:22:35 +09:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-02-03 09:08:12 -08:00
Makefile Linux 5.0-rc5 2019-02-03 13:48:04 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.