linux_dsm_epyc7002/drivers/nvme/host
James Smart 86880d6461 nvme: validate controller state before rescheduling keep alive
Delete operations are seeing NULL pointer references in call_timer_fn.
Tracking these back, the timer appears to be the keep alive timer.

nvme_keep_alive_work(), which is tied to the timer that is cancelled
by nvme_stop_keep_alive(), simply starts the keep alive io but doesn't
wait for its completion. So nvme_stop_keep_alive() only stops a timer
when it's pending. When a keep alive is in flight, there is no timer
running and nvme_stop_keep_alive() will have no effect on the keep
alive io. Thus, if the io completes successfully, the keep alive timer
will be rescheduled. In the failure case, delete is called, the
controller state is changed, nvme_stop_keep_alive() is called while
the io is outstanding, and the delete path continues on. The keep
alive happens to successfully complete before the delete paths mark it
as aborted as part of the queue termination, so the timer is restarted.
The delete paths then tear down the controller, and later on the timer
code fires and the timer entry is now corrupt.

Fix by validating the controller state before rescheduling the keep
alive. Testing with the fix has confirmed the condition above was hit.
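
The race and the fix described above can be modelled in a few lines of userspace C. This is only a sketch, not the code in core.c: the struct, the state names, and every function below are hypothetical stand-ins, the set of states that permit rescheduling is an assumption, and the locking a real driver would need around the state check is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified controller states (not the kernel's enum). */
enum ctrl_state { CTRL_LIVE, CTRL_CONNECTING, CTRL_DELETING };

/* Hypothetical stand-in for the controller: a state plus a flag that
 * models "the keep alive timer is armed". */
struct ctrl {
	enum ctrl_state state;
	bool ka_timer_pending;
};

/* Models nvme_stop_keep_alive(): it can only cancel a pending timer and
 * has no effect on a keep alive io that is already in flight. */
static void stop_keep_alive(struct ctrl *c)
{
	assert(c);
	c->ka_timer_pending = false;
}

/* Pre-fix behaviour: a successful completion always re-arms the timer. */
static void keep_alive_end_io_unchecked(struct ctrl *c, bool io_ok)
{
	if (!io_ok)
		return;
	c->ka_timer_pending = true;
}

/* Post-fix behaviour: re-arm only while the controller is in a state
 * where keep alives still make sense (assumed: LIVE or CONNECTING). */
static void keep_alive_end_io_checked(struct ctrl *c, bool io_ok)
{
	if (!io_ok)
		return;
	if (c->state == CTRL_LIVE || c->state == CTRL_CONNECTING)
		c->ka_timer_pending = true;
}

/* Replays the sequence from the commit message and reports whether the
 * timer is left armed on a controller that is about to be torn down. */
static bool replay_delete_race(void (*end_io)(struct ctrl *, bool))
{
	/* A keep alive io is in flight, so no timer is pending. */
	struct ctrl c = { .state = CTRL_LIVE, .ka_timer_pending = false };

	c.state = CTRL_DELETING;	/* delete changes the controller state */
	stop_keep_alive(&c);		/* nothing pending to cancel */
	end_io(&c, true);		/* io completes before being aborted */

	return c.ka_timer_pending;
}
```

Calling replay_delete_race() with the unchecked handler leaves the timer armed on a controller in the deleting state, which is the stale timer entry the commit message describes. With the checked handler the timer stays disarmed.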

Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-07 07:11:11 -08:00
core.c nvme: validate controller state before rescheduling keep alive 2018-12-07 07:11:11 -08:00
fabrics.c nvme-fabrics: move controller options matching to fabrics 2018-10-19 14:22:24 +02:00
fabrics.h nvme-fabrics: move controller options matching to fabrics 2018-10-19 14:22:24 +02:00
fault_inject.c nvme: Add fault injection feature 2018-03-26 08:53:43 -06:00
fc.c nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() 2018-11-27 18:12:08 +01:00
Kconfig IB: Revert "remove redundant INFINIBAND kconfig dependencies" 2018-05-28 10:40:16 -06:00
lightnvm.c lightnvm: do no update csecs and sos on 1.2 2018-10-09 08:25:08 -06:00
Makefile nvme: Add fault injection feature 2018-03-26 08:53:43 -06:00
multipath.c nvme: make sure ns head inherits underlying device limits 2018-11-09 06:14:47 -07:00
nvme.h nvme: warn when finding multi-port subsystems without multipathing enabled 2018-11-30 17:23:22 +01:00
pci.c nvme-pci: fix conflicting p2p resource adds 2018-11-02 08:14:46 -06:00
rdma.c nvme-rdma: fix double freeing of async event data 2018-11-30 17:23:23 +01:00
trace.c nvme: add disk name to trace events 2018-07-24 15:55:48 +02:00
trace.h nvme-core: add async event trace helper 2018-10-01 14:16:12 -07:00