Commit 6fac5e5b authored by Changpeng Liu, committed by Jim Harris

bdev/nvme: detect Controller Fatal Status when timeout happens



If the controller hits a serious error and sets the Controller
Fatal Status (CFS) field to 1, the host driver does not know about
the error. When a timeout happens, check CFS and reset the
controller to recover from the fatal status.

Change-Id: I9fa5b263b34edc52d0f359d874b2920f7570d1f3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/417622


Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
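
The check described in the commit message can be sketched in isolation, without SPDK. The following is a minimal illustration, not the driver's code: it assumes only the NVMe register layout in which CFS is bit 1 of the CSTS register (RDY is bit 0). The names csts_is_fatal, decide_on_timeout and enum timeout_decision are hypothetical helpers introduced for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the NVMe CSTS register: RDY is bit 0, CFS is bit 1. */
#define CSTS_RDY_MASK (1u << 0)
#define CSTS_CFS_MASK (1u << 1)

/* Hypothetical result type for this sketch; not an SPDK type. */
enum timeout_decision {
	TIMEOUT_DECISION_RESET,      /* CFS set: a controller reset is required */
	TIMEOUT_DECISION_CONFIGURED, /* CFS clear: use the configured timeout action */
};

/* Return true when the raw CSTS value reports Controller Fatal Status. */
static bool
csts_is_fatal(uint32_t csts_raw)
{
	return (csts_raw & CSTS_CFS_MASK) != 0;
}

/*
 * Decide what a timeout handler should do for a given raw CSTS value.
 * The fatal status check comes first, so it overrides any configured
 * action such as abort or reset-on-timeout.
 */
static enum timeout_decision
decide_on_timeout(uint32_t csts_raw)
{
	if (csts_is_fatal(csts_raw)) {
		return TIMEOUT_DECISION_RESET;
	}
	return TIMEOUT_DECISION_CONFIGURED;
}
```

In the actual change the register value comes from spdk_nvme_ctrlr_get_regs_csts() and the reset is issued with spdk_nvme_ctrlr_reset(); the sketch only isolates the decision so it can be exercised with plain integers.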
parent 23ef5c44
+4 −1
@@ -669,7 +669,10 @@ struct spdk_nvme_qpair;
 * request.
 *
 * For timeouts detected on the admin queue pair, the qpair returned here will
 * be NULL.
 * be NULL.  If the controller has a serious error condition and is unable to
 * communicate with the driver via the completion queue, the controller can set
 * the Controller Fatal Status field to 1; a reset is then required to recover
 * from the error.  Users may check Controller Fatal Status when a timeout happens.
 *
 * \param cb_arg Argument passed to callback function.
 * \param ctrlr Opaque handle to NVMe controller.
+11 −0
@@ -851,9 +851,20 @@ timeout_cb(void *cb_arg, struct spdk_nvme_ctrlr *ctrlr,
	   struct spdk_nvme_qpair *qpair, uint16_t cid)
{
	int rc;
	union spdk_nvme_csts_register csts;

	SPDK_WARNLOG("Warning: Detected a timeout. ctrlr=%p qpair=%p cid=%u\n", ctrlr, qpair, cid);

	csts = spdk_nvme_ctrlr_get_regs_csts(ctrlr);
	if (csts.bits.cfs) {
		SPDK_ERRLOG("Controller Fatal Status, reset required\n");
		rc = spdk_nvme_ctrlr_reset(ctrlr);
		if (rc) {
			SPDK_ERRLOG("Resetting controller failed.\n");
		}
		return;
	}

	switch (g_opts.action_on_timeout) {
	case SPDK_BDEV_NVME_TIMEOUT_ACTION_ABORT:
		if (qpair) {