Commit 3fcee8dd authored by Vasuki Manikarnike, committed by Tomasz Zawadzki

lib/nvme: Do not submit queued aborts if adminq is in failed state.

With RDMA, the admin poller can experience a remote disconnect when
processing completions. The admin qpair will be disconnected to handle
this. The disconnect code path will manually complete queued aborts.
However, the completion callback for each abort attempts to resubmit
other queued aborts from the queue. With many aborts queued, this
results in a very large stack and can eventually cause a segfault.
The fix is to not resubmit queued aborts if the admin qpair is in any
kind of failed state.
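
The failure mode described above can be illustrated with a small standalone model. This is only a sketch, not SPDK code: every `toy_*` name, the struct, and `NUM_QUEUED_ABORTS` are hypothetical, and it assumes that submitting an abort on a failed admin queue completes it immediately within the same call stack. Under that assumption, the nesting depth grows with the number of queued aborts unless the retry step checks the queue state first.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_QUEUED_ABORTS 8

/* Hypothetical stand-in for a controller; this is not the SPDK struct. */
struct toy_ctrlr {
	bool adminq_failed;	/* models a failure reason other than "none" */
	bool guard_enabled;	/* models the check added by this commit */
	int queued_aborts;	/* aborts still waiting to be submitted */
	int depth;		/* current nesting of completion callbacks */
	int max_depth;		/* deepest nesting observed so far */
};

static void toy_abort_complete(struct toy_ctrlr *ctrlr);

/* Models the retry step: take one queued abort and try to submit it. */
static void
toy_retry_queued_abort(struct toy_ctrlr *ctrlr)
{
	if (ctrlr->guard_enabled && ctrlr->adminq_failed) {
		/* Don't resubmit aborts if admin qpair is failed */
		return;
	}

	if (ctrlr->queued_aborts == 0) {
		return;
	}

	ctrlr->queued_aborts--;
	if (ctrlr->adminq_failed) {
		/* Assumption: a failed queue completes the abort right away,
		 * on the current call stack. */
		toy_abort_complete(ctrlr);
	}
}

/* Models the abort completion callback, which retries queued aborts. */
static void
toy_abort_complete(struct toy_ctrlr *ctrlr)
{
	ctrlr->depth++;
	if (ctrlr->depth > ctrlr->max_depth) {
		ctrlr->max_depth = ctrlr->depth;
	}
	toy_retry_queued_abort(ctrlr);
	ctrlr->depth--;
}

/* Models the disconnect path manually completing every queued abort.
 * Returns the deepest callback nesting that was reached. */
int
toy_disconnect(bool guard_enabled)
{
	struct toy_ctrlr ctrlr = {
		.adminq_failed = true,
		.guard_enabled = guard_enabled,
		.queued_aborts = NUM_QUEUED_ABORTS,
		.depth = 0,
		.max_depth = 0,
	};

	while (ctrlr.queued_aborts > 0) {
		ctrlr.queued_aborts--;
		toy_abort_complete(&ctrlr);
	}

	return ctrlr.max_depth;
}
```

In this model, `toy_disconnect(false)` nests one callback per queued abort, while `toy_disconnect(true)` never nests deeper than a single callback, because each retry returns before touching the queue.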

Change-Id: I4a6f959232c8a1bd30c87ca50459014e556cbaa0
Signed-off-by: Vasuki Manikarnike <vasuki.manikarnike@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/15114
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
parent 71828b74
+7 −0
@@ -4,6 +4,7 @@
 */

#include "nvme_internal.h"
#include "spdk/nvme.h"

int
spdk_nvme_ctrlr_io_cmd_raw_no_payload_build(struct spdk_nvme_ctrlr *ctrlr,
@@ -546,6 +547,12 @@ nvme_ctrlr_retry_queued_abort(struct spdk_nvme_ctrlr *ctrlr)
	int rc;

	if (ctrlr->is_resetting || ctrlr->is_destructed || ctrlr->is_failed) {
		/* Don't resubmit aborts if ctrlr is failing */
		return;
	}

	if (spdk_nvme_ctrlr_get_admin_qp_failure_reason(ctrlr) != SPDK_NVME_QPAIR_FAILURE_NONE) {
		/* Don't resubmit aborts if admin qpair is failed */
		return;
	}

+3 −0
@@ -51,6 +51,9 @@ DEFINE_STUB(nvme_transport_qpair_iterate_requests, int,
DEFINE_STUB(nvme_qpair_abort_queued_reqs_with_cbarg, uint32_t,
	    (struct spdk_nvme_qpair *qpair, void *cmd_cb_arg), 0);

DEFINE_STUB(spdk_nvme_ctrlr_get_admin_qp_failure_reason, spdk_nvme_qp_failure_reason,
	    (struct spdk_nvme_ctrlr *ctrlr), 0);

static int
nvme_ns_cmp(struct spdk_nvme_ns *ns1, struct spdk_nvme_ns *ns2)
{