During a reset of a fabrics controller, we disconnect the controller and then attempt to reconnect. The disconnection clears the `is_failed` flag in the controller and disconnects the adminq. If something goes wrong in the meantime, the controller may be marked as `is_failed` again.

Then, when the reconnection starts, we reset the adminq's state to CONNECTING and poll it until we get a response to our connection request. But since the controller is marked as `is_failed` and the adminq's state is CONNECTING, the polling does nothing (cf. `spdk_nvme_qpair_process_completions()`). Moreover, the controller is in the WAIT_FOR_CONNECT_ADMINQ state with an infinite timeout, so it may be blocked forever.

Prevent this situation by checking the `is_failed` flag before attempting a reconnection.

Change-Id: Id83ff161e0b389fa2e266468006f619ad6bc65c1
Signed-off-by: Alex Michon <amichon@kalrayinc.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/24649
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
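
The hang described in this message, and the effect of checking `is_failed` first, can be illustrated with a small toy model in C. This is only a sketch under stated assumptions: every `toy_*` type and function is hypothetical and invented for illustration, none of it is SPDK code, and the `-1` return value is a placeholder rather than the error code the actual patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins. These are NOT the real SPDK types. */
enum toy_qpair_state {
	TOY_QPAIR_DISCONNECTED,
	TOY_QPAIR_CONNECTING,
	TOY_QPAIR_CONNECTED
};

enum toy_ctrlr_state {
	TOY_CTRLR_IDLE,
	TOY_CTRLR_WAIT_FOR_CONNECT_ADMINQ,
	TOY_CTRLR_READY
};

struct toy_ctrlr {
	bool is_failed;
	enum toy_qpair_state adminq_state;
	enum toy_ctrlr_state state;
};

/*
 * Models the polling behavior described in the commit message: when the
 * controller is failed and the adminq is still CONNECTING, polling makes
 * no progress. Returns the number of completions processed.
 */
int toy_poll_adminq(struct toy_ctrlr *ctrlr)
{
	assert(ctrlr != NULL);

	if (ctrlr->is_failed && ctrlr->adminq_state == TOY_QPAIR_CONNECTING) {
		return 0; /* nothing is processed, so the state never advances */
	}

	if (ctrlr->adminq_state == TOY_QPAIR_CONNECTING) {
		/* Pretend the connect response arrived. */
		ctrlr->adminq_state = TOY_QPAIR_CONNECTED;
		ctrlr->state = TOY_CTRLR_READY;
		return 1;
	}

	return 0;
}

/*
 * Starts a reconnection. With check_failed_first set, a failed controller
 * is rejected up front (placeholder error value -1) instead of entering
 * the wait state, which has no timeout in this model.
 */
int toy_reconnect(struct toy_ctrlr *ctrlr, bool check_failed_first)
{
	assert(ctrlr != NULL);

	if (check_failed_first && ctrlr->is_failed) {
		return -1;
	}

	ctrlr->adminq_state = TOY_QPAIR_CONNECTING;
	ctrlr->state = TOY_CTRLR_WAIT_FOR_CONNECT_ADMINQ;
	return 0;
}
```

In this model, a failed controller that reconnects without the check stays in `TOY_CTRLR_WAIT_FOR_CONNECT_ADMINQ` no matter how often it is polled, while the checked variant returns an error immediately and leaves the controller state untouched.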