Commit 792807a4 authored by Jim Harris

nvme: fix infinite loop when aborting queued reqs



When we disconnect a qpair, part of the code path is
calling _nvme_qpair_abort_queued_reqs.  This takes
care of aborting any requests that were queued waiting
for slots to open on the submission queue.
It walks the STAILQ one by one and manually completes
them with ABORT status back to the caller.

But if the completion callback submits another request, that
request may itself get queued to the end of the queued_req
STAILQ.  The loop then never sees an empty queue, which can
result in an infinite loop.

The solution is to use an STAILQ_SWAP to a local, empty
STAILQ.  Then we ensure we only abort the requests that
were queued when _nvme_qpair_abort_queued_reqs() started
executing.
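
The swap-to-local pattern described above can be sketched in
isolation as follows. This is only an illustration, not the SPDK
code: demo_req, demo_qpair, demo_submit, demo_abort_cb and
demo_queue_len are hypothetical stand-ins, and the fallback
STAILQ_SWAP definition is an assumption for systems whose
<sys/queue.h> does not provide the macro (glibc, for example).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Fallback for <sys/queue.h> versions without STAILQ_SWAP (assumption:
 * it mirrors the BSD macro by exchanging heads and fixing tail pointers). */
#ifndef STAILQ_SWAP
#define STAILQ_SWAP(head1, head2, type) do {				\
	struct type *swap_first = STAILQ_FIRST(head1);			\
	struct type **swap_last = (head1)->stqh_last;			\
	STAILQ_FIRST(head1) = STAILQ_FIRST(head2);			\
	(head1)->stqh_last = (head2)->stqh_last;			\
	STAILQ_FIRST(head2) = swap_first;				\
	(head2)->stqh_last = swap_last;					\
	if (STAILQ_EMPTY(head1))					\
		(head1)->stqh_last = &STAILQ_FIRST(head1);		\
	if (STAILQ_EMPTY(head2))					\
		(head2)->stqh_last = &STAILQ_FIRST(head2);		\
} while (0)
#endif

/* Hypothetical stand-ins for nvme_request and spdk_nvme_qpair. */
struct demo_req {
	int id;
	STAILQ_ENTRY(demo_req) stailq;
};

struct demo_qpair {
	STAILQ_HEAD(, demo_req) queued_req;
	int next_id;
};

void demo_qpair_init(struct demo_qpair *qp)
{
	STAILQ_INIT(&qp->queued_req);
	qp->next_id = 0;
}

/* Queue a new request at the tail, as a submit with no free slot would. */
void demo_submit(struct demo_qpair *qp)
{
	struct demo_req *req = calloc(1, sizeof(*req));

	assert(req != NULL);
	req->id = qp->next_id++;
	STAILQ_INSERT_TAIL(&qp->queued_req, req, stailq);
}

/* Completion callback that always submits a replacement request. */
void demo_abort_cb(struct demo_qpair *qp, struct demo_req *req)
{
	(void)req;
	demo_submit(qp);
}

int demo_queue_len(struct demo_qpair *qp)
{
	struct demo_req *req;
	int n = 0;

	STAILQ_FOREACH(req, &qp->queued_req, stailq) {
		n++;
	}
	return n;
}

/* Abort only the requests queued at entry; returns how many were aborted. */
int demo_abort_queued_reqs(struct demo_qpair *qp)
{
	STAILQ_HEAD(, demo_req) tmp;
	struct demo_req *req;
	int aborted = 0;

	STAILQ_INIT(&tmp);
	STAILQ_SWAP(&tmp, &qp->queued_req, demo_req);

	while (!STAILQ_EMPTY(&tmp)) {
		req = STAILQ_FIRST(&tmp);
		STAILQ_REMOVE_HEAD(&tmp, stailq);
		demo_abort_cb(qp, req);	/* may append to qp->queued_req, not tmp */
		free(req);
		aborted++;
	}
	return aborted;
}
```

In this sketch, if three requests are queued before the call, the
function aborts exactly those three and returns, leaving the three
requests submitted by the callback on queued_req.  Draining
qp->queued_req directly with the same callback would never reach an
empty queue.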

Fixes issue #1588.

I used the multipath.sh test to reproduce this on my local
system. If it ever dropped into the STAILQ loop in this
function, we would hit the infinite loop.  With this patch,
I confirmed locally that now we safely avoid the infinite
loop and the test passes.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I657db23efe5983bd8613c870ad62695a7fc7f689
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4284


Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
parent 5198fbed
+7 −3
@@ -545,10 +545,14 @@ static void
 _nvme_qpair_abort_queued_reqs(struct spdk_nvme_qpair *qpair, uint32_t dnr)
 {
 	struct nvme_request		*req;
+	STAILQ_HEAD(, nvme_request)	tmp;
 
-	while (!STAILQ_EMPTY(&qpair->queued_req)) {
-		req = STAILQ_FIRST(&qpair->queued_req);
-		STAILQ_REMOVE_HEAD(&qpair->queued_req, stailq);
+	STAILQ_INIT(&tmp);
+	STAILQ_SWAP(&tmp, &qpair->queued_req, nvme_request);
+
+	while (!STAILQ_EMPTY(&tmp)) {
+		req = STAILQ_FIRST(&tmp);
+		STAILQ_REMOVE_HEAD(&tmp, stailq);
 		if (!qpair->ctrlr->opts.disable_error_logging) {
 			SPDK_ERRLOG("aborting queued i/o\n");
 		}