Commit eaf525cb authored by Alexey Marchuk's avatar Alexey Marchuk Committed by Tomasz Zawadzki
Browse files

nvme: Don't process poll group recursively



If TCP/RDMA qpair is connected to a poll group and
is waiting for FABRIC_CONNECT response, it calls
nvme_fabric_qpair_connect_poll which calls
spdk_nvme_poll_group_process_completions, so
the last function is called recursively.
It can lead to various problems, e.g. in case
of TCP, we may have the following situation:
1. nvme_tcp_poll_group_process_completions calls
spdk_sock_group_poll and last function reaps
events for 2 socketsi, e.g. sA and sB.
2. sA calls cb_fn (nvme_tcp_qpair_sock_cb) of
its qpair qA. Assuming that qpair is in connecting
state, it recursively calls nvme_tcp_poll_group_process_completions
3. nvme_tcp_poll_group_process_completions processes
needs_poll list and calls nvme_tcp_qpair_sock_cb
for qB. qB may also be in connecting state and
assume that it receives FABRIC_CONNECT cpl
with error, so qB starts disconnecting.
4. Discconnecting of qB sets sB cb_fn to
NULL.
5. Stack unwinded and returns to step 1
where next socket is sB which was previously
disconnected and its cb_fn is NULL
6. Null pointer dereferencing occurs.

Signed-off-by: default avatarAlexey Marchuk <alexeymar@nvidia.com>
Change-Id: Ia0d2e6d3cc042e2c577efd2ebf5e0d73300311fb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/18785


Reviewed-by: default avatarKonrad Sztyber <konrad.sztyber@intel.com>
Tested-by: default avatarSPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: default avatarShuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: default avatarChangpeng Liu <changpeng.liu@intel.com>
parent af130056
Loading
Loading
Loading
Loading
+1 −0
Original line number Diff line number Diff line
@@ -488,6 +488,7 @@ struct spdk_nvme_poll_group {
	void						*ctx;
	struct spdk_nvme_accel_fn_table			accel_fn_table;
	STAILQ_HEAD(, spdk_nvme_transport_poll_group)	tgroups;
	bool						in_process_completions;
};

struct spdk_nvme_transport_poll_group {
+6 −0
Original line number Diff line number Diff line
@@ -125,6 +125,11 @@ spdk_nvme_poll_group_process_completions(struct spdk_nvme_poll_group *group,
		return -EINVAL;
	}

	if (spdk_unlikely(group->in_process_completions)) {
		return 0;
	}
	group->in_process_completions = true;

	STAILQ_FOREACH(tgroup, &group->tgroups, link) {
		local_completions = nvme_transport_poll_group_process_completions(tgroup, completions_per_qpair,
				    disconnected_qpair_cb);
@@ -136,6 +141,7 @@ spdk_nvme_poll_group_process_completions(struct spdk_nvme_poll_group *group,
			assert(num_completions >= 0);
		}
	}
	group->in_process_completions = false;

	return error_reason ? error_reason : num_completions;
}