mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git (synced 2024-12-21 16:36:47 +07:00)
habanalabs: flush EQ workers in hard reset
During hard-reset, there can be multiple events received from the H/W. For
each event, the driver opens a worker thread to handle it. For some of the
events, the driver will read/write registers in the code that handles the
event.

In case of hard-reset, we must prevent reads/writes to the registers during
the reset operation because the device might get stuck if that happens.

Therefore, flush the EQ workers before resetting the device (in hard-reset
only). Additional events won't arrive as we synced and disabled the
interrupts.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
parent 1af69d30c4
commit 55f6d68097
@@ -887,13 +887,19 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
 	/* Go over all the queues, release all CS and their jobs */
 	hl_cs_rollback_all(hdev);
 
-	/* Kill processes here after CS rollback. This is because the process
-	 * can't really exit until all its CSs are done, which is what we
-	 * do in cs rollback
-	 */
-	if (hard_reset)
+	if (hard_reset) {
+		/* Kill processes here after CS rollback. This is because the
+		 * process can't really exit until all its CSs are done, which
+		 * is what we do in cs rollback
+		 */
 		device_kill_open_processes(hdev);
 
+		/* Flush the Event queue workers to make sure no other thread is
+		 * reading or writing to registers during the reset
+		 */
+		flush_workqueue(hdev->eq_wq);
+	}
+
 	/* Release kernel context */
 	if ((hard_reset) && (hl_ctx_put(hdev->kernel_ctx) == 1))
 		hdev->kernel_ctx = NULL;
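The ordering this commit enforces (stop new events, drain the pending event handlers, only then reset) can be illustrated with a small userspace sketch. This is a single-threaded toy model, not driver code: `struct toy_device`, `toy_queue_event()`, `toy_flush_workers()` and `toy_hard_reset()` are hypothetical names invented for the illustration, and the second drain inside the reset window merely stands in for handlers that would otherwise run concurrently with the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WORK 16

struct toy_device;
typedef void (*toy_work_fn)(struct toy_device *dev);

/* Hypothetical stand-in for a device; NOT the driver's struct hl_device. */
struct toy_device {
	toy_work_fn pending[TOY_MAX_WORK]; /* queued event handlers */
	size_t nr_pending;
	bool irq_enabled;    /* events are only accepted while true */
	bool in_reset;       /* register access is unsafe while true */
	int reg_accesses;    /* register accesses made while safe */
	int unsafe_accesses; /* register accesses made during reset */
};

/* Event handler that touches "registers", as some EQ handlers do. */
static void toy_handle_event(struct toy_device *dev)
{
	if (dev->in_reset)
		dev->unsafe_accesses++;
	else
		dev->reg_accesses++;
}

/* Queue a handler for a new event; returns false if the event is dropped. */
static bool toy_queue_event(struct toy_device *dev)
{
	if (!dev->irq_enabled || dev->nr_pending == TOY_MAX_WORK)
		return false;
	dev->pending[dev->nr_pending++] = toy_handle_event;
	return true;
}

/* Run every queued handler to completion (the role flush_workqueue() plays). */
static void toy_flush_workers(struct toy_device *dev)
{
	for (size_t i = 0; i < dev->nr_pending; i++)
		dev->pending[i](dev);
	dev->nr_pending = 0;
}

/* Hard reset; flush_first selects the ordering introduced by the commit. */
static void toy_hard_reset(struct toy_device *dev, bool flush_first)
{
	dev->irq_enabled = false;       /* no additional events can arrive */
	if (flush_first)
		toy_flush_workers(dev); /* drain before touching the device */

	dev->in_reset = true;
	/* Models handlers that were still pending and race with the reset. */
	toy_flush_workers(dev);
	assert(dev->nr_pending == 0);
	dev->in_reset = false;
	dev->irq_enabled = true;
}
```

With `flush_first` set, every queued handler runs before the reset window opens, so `unsafe_accesses` stays at zero. With it cleared, the same handlers run inside the window and are counted as unsafe, which is the situation the commit message describes as able to leave the device stuck.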