Commit 86559513 authored by Shuhei Matsumoto, committed by Jim Harris

bdev/nvme: Clear nvme_ctrlr->reset_start_tsc only if reset succeeded



nvme_ctrlr->reset_start_tsc records the time at which reset started
failing.

However, bdev_nvme_failover_ctrlr() can be called for any reason while
reconnect is failing. In that case, bdev_nvme_check_op_after_reset()
cleared reset_start_tsc and bdev_nvme_failover_ctrlr() then set it to
the current time.

If bdev_nvme_failover_ctrlr() and reconnect failure kept repeating,
ctrlr_loss_timeout never expired.
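
The following toy model illustrates why the timeout could not expire. It
is a sketch only, not SPDK code: struct toy_ctrlr and every toy_* function
are hypothetical, ticks are plain integers instead of spdk_get_ticks()
values, and the timeout check is assumed to be a simple elapsed-time
comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model, not SPDK code: only the timestamp is tracked. */
struct toy_ctrlr {
	uint64_t reset_start_tsc;
};

/* Old behavior: a pending failover cleared the timestamp in the check
 * function, and the failover itself then restarted it at the current time. */
static void
toy_failover_old(struct toy_ctrlr *c, uint64_t now)
{
	c->reset_start_tsc = 0;   /* cleared by the check function */
	c->reset_start_tsc = now; /* restarted by failover */
}

/* Assumed timeout check: true once `timeout` ticks have elapsed. */
static bool
toy_loss_timeout_expired(const struct toy_ctrlr *c, uint64_t now, uint64_t timeout)
{
	return now - c->reset_start_tsc >= timeout;
}

/* Reconnect keeps failing for `total` ticks while a failover is triggered
 * every `period` ticks. Returns true if the timeout ever expires. */
static bool
toy_old_ever_expires(uint64_t period, uint64_t total, uint64_t timeout)
{
	struct toy_ctrlr c = { .reset_start_tsc = 1 };

	assert(period > 0);
	for (uint64_t now = 1; now <= total; now++) {
		if (toy_loss_timeout_expired(&c, now, timeout)) {
			return true;
		}
		if (now % period == 0) {
			toy_failover_old(&c, now);
		}
	}
	return false;
}
```

In this model, whenever failovers arrive more often than the timeout
length, the elapsed time is reset before it can reach the timeout, no
matter how long the failures continue.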

Fix the bug as follows:
- Clear nvme_ctrlr->reset_start_tsc only if reset succeeded.
- reset/failover updates nvme_ctrlr->reset_start_tsc only if it is zero.
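
The two rules above can be sketched in isolation as follows. This is a
minimal illustration under assumptions, not the patched SPDK code:
struct sketch_ctrlr and the sketch_* helpers are hypothetical names, and
the tick value is passed in as an argument rather than read with
spdk_get_ticks().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the one field of interest in nvme_ctrlr. */
struct sketch_ctrlr {
	uint64_t reset_start_tsc; /* 0 means no failing reset sequence is recorded */
};

/* Rule 2: reset/failover records the start time only if none is recorded. */
static void
sketch_start_reset(struct sketch_ctrlr *c, uint64_t now)
{
	assert(now != 0); /* 0 is reserved as the "not set" marker in this sketch */
	if (c->reset_start_tsc == 0) {
		c->reset_start_tsc = now;
	}
}

/* Rule 1: the start time is cleared only if the reset succeeded. */
static void
sketch_reset_complete(struct sketch_ctrlr *c, bool success)
{
	if (success) {
		c->reset_start_tsc = 0;
	}
}

/* Ticks elapsed since the first failure, or 0 if nothing is recorded. */
static uint64_t
sketch_elapsed(const struct sketch_ctrlr *c, uint64_t now)
{
	return c->reset_start_tsc == 0 ? 0 : now - c->reset_start_tsc;
}
```

With these helpers, a failed reset started at tick 100 followed by a
failover at tick 150 leaves the recorded start at 100, so the elapsed
time keeps growing until a reset finally succeeds.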

This fix also removes a source of confusion: a check function used to
modify state. The check function no longer changes reset_start_tsc.

Change-Id: I206b2ca756c4cc4f074eb44a6e2d21e08641822c
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/25863
Reviewed-by: Jacek Kalwas <jacek.kalwas@nutanix.com>
Reviewed-by: Ben Walker <ben@nvidia.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <jim.harris@nvidia.com>
Tested-by: SPDK Automated Test System <spdkbot@gmail.com>
parent fbfe86e6
+7 −6
@@ -2179,10 +2179,8 @@ bdev_nvme_check_op_after_reset(struct nvme_ctrlr *nvme_ctrlr, bool success,
 		/* Complete pending destruct after reset completes. */
 		return OP_COMPLETE_PENDING_DESTRUCT;
 	} else if (pending_failover) {
-		nvme_ctrlr->reset_start_tsc = 0;
 		return OP_FAILOVER;
 	} else if (success || nvme_ctrlr->opts.reconnect_delay_sec == 0) {
-		nvme_ctrlr->reset_start_tsc = 0;
 		return OP_NONE;
 	} else if (bdev_nvme_check_ctrlr_loss_timeout(nvme_ctrlr)) {
 		return OP_DESTRUCT;
@@ -2297,6 +2295,7 @@ bdev_nvme_reset_ctrlr_complete(struct nvme_ctrlr *nvme_ctrlr, bool success)
 		}
 	} else {
 		NVME_CTRLR_NOTICELOG(nvme_ctrlr, "Resetting controller successful.\n");
+		nvme_ctrlr->reset_start_tsc = 0;
 	}
 
 	nvme_ctrlr->resetting = false;
@@ -2623,10 +2622,11 @@ bdev_nvme_reset_ctrlr_unsafe(struct nvme_ctrlr *nvme_ctrlr, spdk_msg_fn *msg_fn)
 		nvme_ctrlr->reconnect_is_delayed = false;
 	} else {
 		*msg_fn = _bdev_nvme_reset_ctrlr;
-		assert(nvme_ctrlr->reset_start_tsc == 0);
 	}
 
-	nvme_ctrlr->reset_start_tsc = spdk_get_ticks();
+	if (nvme_ctrlr->reset_start_tsc == 0) {
+		nvme_ctrlr->reset_start_tsc = spdk_get_ticks();
+	}
 
 	return 0;
 }
@@ -3207,8 +3207,9 @@ bdev_nvme_failover_ctrlr_unsafe(struct nvme_ctrlr *nvme_ctrlr, bool remove)
 	nvme_ctrlr->resetting = true;
 	nvme_ctrlr->in_failover = true;
 
-	assert(nvme_ctrlr->reset_start_tsc == 0);
-	nvme_ctrlr->reset_start_tsc = spdk_get_ticks();
+	if (nvme_ctrlr->reset_start_tsc == 0) {
+		nvme_ctrlr->reset_start_tsc = spdk_get_ticks();
+	}
 
 	return 0;
 }