The following scenario might occur when nvmf_tgt is stopped:

1. nvmf_tgt receives SIGINT and changes its state to
   NVMF_TGT_FINI_STOP_SUBSYSTEMS.
2. In this state nvmf_tgt stops all subsystems and disconnects the
   associated qpairs.
3. For an RDMA qpair, the state is changed to IBV_QPS_ERR. Once the
   qpair is in IBV_QPS_ERR, the RDMA device generates a
   LAST_WQE_REACHED event when there are no more WQEs that can be
   consumed from the SRQ by this qpair.
4. When all subsystems are stopped, some qpairs may still be alive
   because they have not yet received the LAST_WQE_REACHED event.
5. nvmf_tgt stops all poll groups and forcefully destroys any qpairs
   linked to them.
6. At this moment a LAST_WQE_REACHED event might be generated and
   received in another thread. The handler of this event sends a
   message with a pointer to the qpair. The qpair itself may already
   be destroyed.
7. The thread that owned the qpair receives the message
   (LAST_WQE_REACHED) with a pointer to the already destroyed qpair
   and destroys it a second time, when all of its pointers are
   invalid.

ibv events related to a qpair should be handled by the thread that
owns this qpair. This commit adds a new structure that describes an
ibv event, helper functions for sending the event, and a list of
events per RDMA qpair. It also adds synchronization for the
LAST_WQE_REACHED event.

Fixes #1075

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Change-Id: I22bff89741708df2518760934ecb4e33fad49473
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476712
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
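
The idea behind the per-qpair event list can be illustrated with a minimal, single-threaded sketch. This is not the code from this commit: every name below (`event_ctx`, `event_send`, `event_deliver`, `qpair_destroy`, `run_late_event_scenario`) is hypothetical, the "message" is modeled as a direct function call, and locking of the list itself is deliberately left out. It assumes only the `STAILQ` macros from `<sys/queue.h>`. The point it shows is that destroying a qpair detaches any in-flight events, so a handler that is delivered late sees a NULL qpair and skips its callback instead of touching freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/queue.h>

struct qpair;

/* Hypothetical context for one asynchronous event that must be handled
 * on the thread owning the qpair. */
struct event_ctx {
	struct qpair *qpair;              /* set to NULL if the qpair dies first */
	void (*cb)(struct qpair *qpair);  /* handler to run on the owner thread */
	STAILQ_ENTRY(event_ctx) link;
};

struct qpair {
	int last_wqe_reached;
	STAILQ_HEAD(, event_ctx) events;  /* events sent but not yet handled */
};

static int g_handled; /* counts handlers that actually ran */

static void on_last_wqe(struct qpair *qpair)
{
	qpair->last_wqe_reached = 1;
	g_handled++;
}

static struct qpair *qpair_create(void)
{
	struct qpair *qp = calloc(1, sizeof(*qp));

	if (qp != NULL) {
		STAILQ_INIT(&qp->events);
	}
	return qp;
}

/* Models the event-receiving side: record the event on the qpair and
 * return the context that would be sent as a message to the owner. */
static struct event_ctx *event_send(struct qpair *qp, void (*cb)(struct qpair *))
{
	struct event_ctx *ctx = calloc(1, sizeof(*ctx));

	if (ctx == NULL) {
		return NULL;
	}
	ctx->qpair = qp;
	ctx->cb = cb;
	STAILQ_INSERT_TAIL(&qp->events, ctx, link);
	return ctx;
}

/* Models message delivery on the owner thread: run the handler only if
 * the qpair is still alive, then free the context. */
static void event_deliver(struct event_ctx *ctx)
{
	if (ctx->qpair != NULL) {
		STAILQ_REMOVE(&ctx->qpair->events, ctx, event_ctx, link);
		ctx->cb(ctx->qpair);
	}
	free(ctx);
}

/* Destroying the qpair detaches in-flight events rather than freeing
 * them; each context is freed later by event_deliver(). */
static void qpair_destroy(struct qpair *qp)
{
	struct event_ctx *ctx;

	STAILQ_FOREACH(ctx, &qp->events, link) {
		ctx->qpair = NULL;
	}
	free(qp);
}

/* Steps 5-7 of the scenario: the qpair is destroyed before the event is
 * delivered. Returns the number of handlers that ran, or -1 on OOM. */
static int run_late_event_scenario(void)
{
	struct qpair *qp = qpair_create();
	struct event_ctx *ctx;

	if (qp == NULL) {
		return -1;
	}
	ctx = event_send(qp, on_last_wqe);
	if (ctx == NULL) {
		free(qp);
		return -1;
	}
	g_handled = 0;
	qpair_destroy(qp);   /* forced destruction during shutdown */
	event_deliver(ctx);  /* late message: callback is skipped */
	return g_handled;
}
```

In this sketch the event context outlives the qpair and is owned by whoever delivers the message, which is one way to avoid the double destroy described in step 7. A real implementation would also have to protect the list against concurrent access from the thread that receives the ibv event.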