Commit af5a744d authored by Yash Raj Singh's avatar Yash Raj Singh Committed by Tomasz Zawadzki
Browse files

lib/nvme: Fix SIGSEGV due to race between AER and controller ID Desc.



This patch fixes the SIGSEGV condition inside
spdk_nvme_ctrlr_identify_id_desc_async function. This occurs when AER is
also being processed while we query ID descriptors iteratively for all
the namespaces. The issue arises from the fact that the newly added
namespace from the AER request may not be fully initialized when we
query for its ID descriptor, leading to a NULL pointer dereference.
We fix this issue with now setting the controller pointer in the
namespace structure when it is added to the controller's namespace
tree. Thus, avoiding the SIGSEGV condition.

More details as to why the issue occurs:
nvme controller is a state machine and during initialization goes
through multiple stages, like NVME_CTRLR_STATE_CONNECT_ADMINQ,..
NVME_CTRLR_STATE_CONFIGURE_AER(enable AER), ...
NVME_CTRLR_STATE_IDENTIFY_ID_DESCS (Offending state) etc.

Now what happens is that when we are in the
NVME_CTRLR_STATE_IDENTIFY_ID_DESCS state, we issue
nvme_ctrlr_identify_id_desc_async on all the active namespaces by
iterating on the nvme_ns_tree inside spdk_nvme_ctrlr.

In the meanwhile since AER is also enabled, we can receive events that
new namespaces are added or removed. Assume the below codeflow for AER
to understand better->
1. nvme_ctrlr_complete_queued_async_events
2. nvme_ctrlr_process_async_event
     1. nvme_ctrlr_identify_active_ns
	      1. nvme_ctrlr_identify_active_ns_async
		  2. nvme_ctrlr_identify_active_ns_swap - add namespaces
		     discovered from the array above and add them in the
			 nvme_ns_tree and make them active. (note the ctrlr field of
			 spdk_nvme_ns is not set yet) -->(A)
	 2. nvme_ctrlr_update_namespaces
	      1. nvme_ns_construct
		       1. nvme_ctrlr_identify_ns - issues identify on the
			      namespace and process admin queue completions. --(B)

Now when we reach to point (B) after executing (A), the namespace is not
fully initialized yet, as the ctrlr field in the spdk_nvme_ns is not set
yet. Thus, when we try to access the ctrlr field in the spdk_nvme_ns
inside the nvme_ctrlr_identify_id_desc_async, we get a NULL pointer
dereference, leading to SIGSEGV.

Change-Id: Ib4f39b016cb8bfa636195f09b3750dc6f90b1667
Signed-off-by: default avatarYash Raj Singh <yash.rajsingh@nutanix.com>
Reviewed-on: https://review.spdk.io/c/spdk/spdk/+/26381


Community-CI: Mellanox Build Bot
Tested-by: default avatarSPDK Automated Test System <spdkbot@gmail.com>
Reviewed-by: default avatarJim Harris <jim.harris@nvidia.com>
Reviewed-by: default avatarTomasz Zawadzki <tomasz@tzawadzki.com>
parent 767d8f73
Loading
Loading
Loading
Loading
+1 −0
Original line number Diff line number Diff line
@@ -4741,6 +4741,7 @@ spdk_nvme_ctrlr_get_ns(struct spdk_nvme_ctrlr *ctrlr, uint32_t nsid)

		NVME_CTRLR_DEBUGLOG(ctrlr, "Namespace %u was added\n", nsid);
		ns->id = nsid;
		ns->ctrlr = ctrlr;
		RB_INSERT(nvme_ns_tree, &ctrlr->ns, ns);
	}