Supporting SRQ introduced two kinds of memory leaks. Fix both in this patch.

1. rqpair->rsps was leaked and a null pointer access occurred

An error was detected during the nightly nvmf_delete_subsystem test. The NVMe perf tool crashed with SIGABRT. The reason for the crash was:

    nvme_rdma.c:2504:2: runtime error: member access within null pointer of type 'struct nvme_rdma_rsps'

This was caused by clearing rqpair->rsps before freeing it. rqpair->rsps should have been held until it was freed. However, when SRQ support was added, rqpair->rsps was cleared by mistake when releasing rqpair->poller. rqpair->rsps should be cleared only if SRQ is enabled, because in that case rqpair uses the rsps of rqpair->poller.

2. rqpair->reqs and rsps were leaked for the admin qpair at controller reset

To avoid unnecessary alloc and free of rqpair->rsps when SRQ is enabled, nvme_rdma_create_reqs() and nvme_rdma_create_rsps() were moved to nvme_rdma_connect_established(). On the other hand, nvme_rdma_free_reqs() and nvme_rdma_free_rsps() were called by nvme_rdma_ctrlr_delete_io_qpair(). However, at controller reset, the admin qpair is just disconnected and reconnected. In this case, nvme_rdma_create_reqs() and nvme_rdma_create_rsps() were called again without calling nvme_rdma_free_reqs() and nvme_rdma_free_rsps(). Hence, a memory leak occurred.

To fix the memory leak, move nvme_rdma_free_reqs() and nvme_rdma_free_rsps() from nvme_rdma_ctrlr_delete_io_qpair() to nvme_rdma_qpair_destroy().

One of the fixes for issue #2874.

Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Change-Id: I167ba908cff73d7a0be2248affce4c54f233da51
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/16384
Reviewed-by: Aleksey Marchuk <alexeymar@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
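
The ownership rule described in the message can be illustrated with a minimal, standalone sketch. This is not the SPDK code: every type and function below (the sketch_* names and the allocation counter) is hypothetical and exists only to show the pattern. It assumes that reqs/rsps are created each time the connection is established and freed in the per-disconnect destroy path, and that a qpair using SRQ only borrows the rsps of its poller, so it drops the pointer instead of freeing it. Releasing the poller reference itself is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real SPDK structures. */
struct sketch_reqs { int num_entries; };
struct sketch_rsps { int num_entries; };

struct sketch_poller {
	struct sketch_rsps *rsps;	/* shared by qpairs when SRQ is enabled */
};

struct sketch_qpair {
	struct sketch_poller *poller;	/* non-NULL only when SRQ is in use */
	struct sketch_reqs *reqs;	/* always owned by the qpair */
	struct sketch_rsps *rsps;	/* owned, or borrowed from poller under SRQ */
};

static int g_live_allocs;	/* outstanding allocations made by this sketch */

static void *sketch_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p != NULL) {
		g_live_allocs++;
	}
	return p;
}

static void sketch_free(void *p)
{
	if (p != NULL) {
		g_live_allocs--;
		free(p);
	}
}

/* Runs every time the connection is (re)established, including after a reset. */
static int sketch_connect_established(struct sketch_qpair *q)
{
	/* The previous connection must already have released its resources. */
	assert(q->reqs == NULL && q->rsps == NULL);

	q->reqs = sketch_alloc(sizeof(*q->reqs));
	if (q->reqs == NULL) {
		return -1;
	}
	if (q->poller != NULL) {
		q->rsps = q->poller->rsps;	/* SRQ: borrow, do not allocate */
	} else {
		q->rsps = sketch_alloc(sizeof(*q->rsps));
		if (q->rsps == NULL) {
			sketch_free(q->reqs);
			q->reqs = NULL;
			return -1;
		}
	}
	return 0;
}

/* Runs on every disconnect, so a reset cannot skip the frees. */
static void sketch_qpair_destroy(struct sketch_qpair *q)
{
	if (q->poller != NULL) {
		q->rsps = NULL;		/* SRQ: only drop the borrowed pointer */
	} else {
		sketch_free(q->rsps);	/* no SRQ: the qpair owns rsps */
		q->rsps = NULL;
	}
	sketch_free(q->reqs);
	q->reqs = NULL;
}

/* Simulates `resets` disconnect/reconnect cycles; returns leaked allocations. */
int sketch_reset_cycles(bool use_srq, int resets)
{
	struct sketch_poller poller = { NULL };
	struct sketch_qpair q = { NULL, NULL, NULL };
	int i;

	g_live_allocs = 0;
	if (use_srq) {
		poller.rsps = sketch_alloc(sizeof(*poller.rsps));
		if (poller.rsps == NULL) {
			return -1;
		}
		q.poller = &poller;
	}
	for (i = 0; i <= resets; i++) {
		if (sketch_connect_established(&q) != 0) {
			sketch_free(poller.rsps);
			return -1;
		}
		sketch_qpair_destroy(&q);
	}
	sketch_free(poller.rsps);
	return g_live_allocs;
}
```

In this sketch, sketch_reset_cycles(false, 3) and sketch_reset_cycles(true, 3) both return 0, because every create in the connect path is paired with a free (or a dropped borrow) in the destroy path. If the frees lived only in a hypothetical delete-only path, each reset cycle would leave one reqs and, without SRQ, one rsps allocation behind.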