Commit bfcfe71a authored by Shuhei Matsumoto, committed by Jim Harris

nvme_rdma: Ignore response if its QP was already destroyed



This is a workaround, but it is necessary to fix GitHub issue #2874.
For an unknown reason, in nightly tests with Intel E810 NICs, when a
qpair is created in synchronous mode and connection errors are
detected, the qpair is destroyed even if requests for the qpair are
still in flight. nvme_rdma_process_recv_completion() then causes a NULL
pointer access. To fix this NULL pointer access, change
nvme_rdma_process_recv_completion() to return immediately if rsp->rqpair
is NULL. Add a TODO comment to find the root cause and properly fix the
issue.

One of the fixes for issue #2874.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16431 (master)

(cherry picked from commit bbd3d96b)
Change-Id: Ic810922f7ea1b32373b15f4e0cf7c2429659cbab
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16489


Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
parent 443fa3b0
+8 −0
@@ -2508,7 +2508,15 @@ nvme_rdma_process_recv_completion(struct nvme_rdma_poller *poller, struct ibv_wc
		}
	} else {
		rqpair = rdma_rsp->rqpair;
		if (spdk_unlikely(!rqpair)) {
			/* TODO: Fix forceful QP destroy when it is not async mode.
			 * CQ itself did not cause any error. Hence, return 0 for now.
			 */
			SPDK_WARNLOG("QP might be already destroyed.\n");
			return 0;
		}
	}


	assert(rqpair->rsps->current_num_recvs > 0);
	rqpair->rsps->current_num_recvs--;
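
The guard added in this hunk can be illustrated outside SPDK with a minimal sketch. The mock_* types and mock_process_recv_completion() below are hypothetical stand-ins, not the SPDK structures or the real function; they keep only the fields the hunk touches. Returning 1 on the normal path is an assumption made for illustration; the hunk itself only shows the early return of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the SPDK structures; only the fields used here. */
struct mock_rsps {
	int current_num_recvs;
};

struct mock_rqpair {
	struct mock_rsps *rsps;
};

struct mock_rsp {
	struct mock_rqpair *rqpair; /* NULL once the owning QP has been destroyed */
};

/*
 * Returns 0 if the response was ignored because its QP is gone,
 * 1 (an assumed value for this sketch) if the receive was accounted for.
 */
static int
mock_process_recv_completion(struct mock_rsp *rsp)
{
	struct mock_rqpair *rqpair = rsp->rqpair;

	if (rqpair == NULL) {
		/* Mirrors the workaround: warn and bail out before any dereference. */
		fprintf(stderr, "QP might be already destroyed.\n");
		return 0;
	}

	/* Only reached with a live qpair, so these dereferences are safe. */
	assert(rqpair->rsps->current_num_recvs > 0);
	rqpair->rsps->current_num_recvs--;
	return 1;
}
```

In this sketch a response whose rqpair is NULL leaves every counter untouched, while a response with a live qpair decrements current_num_recvs as before. The check has to come before the assert, because the assert itself dereferences rqpair.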