Commit e745bb65 authored by Theo Jepsen, committed by Tomasz Zawadzki

doc/nvmf: nvmf multipath documentation

Signed-off-by: Theo Jepsen <theo.jepsen@intel.com>
Change-Id: Iff7d6e3aaf3c078647f70a9a63584e12cd8356ea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/14042
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
parent 629135a7
doc/Doxyfile +1 −0
@@ -834,6 +834,7 @@ INPUT += \
                         nvmf.md \
                         nvmf_tgt_pg.md \
                         nvmf_tracing.md \
                         nvmf_multipath_howto.md \
                         overview.md \
                         peer_2_peer.md \
                         pkgconfig.md \
doc/nvmf.md +5 −0
@@ -270,3 +270,8 @@ nvme disconnect -n "nqn.2016-06.io.spdk:cnode1"

SPDK has a tracing framework for capturing low-level event information at runtime.
@ref nvmf_tgt_tracepoints enable analysis of both performance and application crashes.

## Enabling NVMe-oF Multipath

The SPDK NVMe-oF target and initiator support multiple independent paths to the same NVMe-oF subsystem.
For step-by-step instructions on configuring paths and switching between them, see @ref nvmf_multipath_howto .
doc/nvmf_multipath_howto.md (new file) +104 −0
# NVMe-oF Multipath HOWTO {#nvmf_multipath_howto}

This HOWTO provides step-by-step instructions for setting up a simple SPDK deployment and testing multipath.
It demonstrates configuring path preferences with Asymmetric Namespace Access (ANA), as well as round-robin
path load balancing.

## Build SPDK on both the initiator and target servers

Clone the repo:
~~~{.sh}
git clone https://github.com/spdk/spdk
~~~

Configure and build SPDK:
~~~{.sh}
cd spdk/
git submodule update --init
./configure
make -j16
~~~
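
The `-j16` above assumes a machine with around 16 cores; one portable alternative is to size the job count automatically:
~~~{.sh}
make -j$(nproc)
~~~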

## Setup hugepages

Run this once on each server (and again after each reboot):
~~~{.sh}
cd spdk/
./scripts/setup.sh
~~~
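
By default `setup.sh` reserves a modest hugepage pool. If a test needs more memory, the pool size (in MB) can be requested with the `HUGEMEM` variable understood by `setup.sh`, for example:
~~~{.sh}
HUGEMEM=4096 ./scripts/setup.sh
~~~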

## On target: start and configure SPDK

Start the target in the background and configure it:
~~~{.sh}
cd spdk/
./build/bin/nvmf_tgt -m 0x3 &
./scripts/rpc.py nvmf_create_transport -t tcp -o -u 8192
~~~
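
As a sanity check, the newly created TCP transport and its parameters can be queried back:
~~~{.sh}
./scripts/rpc.py nvmf_get_transports
~~~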

Create a subsystem, with `-r` to enable the ANA reporting feature:
~~~{.sh}
./scripts/rpc.py nvmf_create_subsystem nqn.2022-02.io.spdk:cnode0 -a -s SPDK00000000000001 -r
~~~

Create and add a malloc block device:
~~~{.sh}
./scripts/rpc.py bdev_malloc_create 64 512 -b Malloc0
./scripts/rpc.py nvmf_subsystem_add_ns nqn.2022-02.io.spdk:cnode0 Malloc0
~~~

Add two listeners, each with a different `IP:port` pair:
~~~{.sh}
./scripts/rpc.py nvmf_subsystem_add_listener -t tcp -a 172.17.1.13 -s 4420 nqn.2022-02.io.spdk:cnode0
./scripts/rpc.py nvmf_subsystem_add_listener -t tcp -a 172.18.1.13 -s 5520 nqn.2022-02.io.spdk:cnode0
~~~
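
At this point the target is fully configured. `nvmf_get_subsystems` should list the subsystem with its namespace and both listeners:
~~~{.sh}
./scripts/rpc.py nvmf_get_subsystems
~~~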

## On initiator: start and configure bdevperf

Launch the bdevperf process in the background. `-z` makes it wait for an RPC before starting the test; `-q 128 -o 4096 -w verify -t 90` select a queue depth of 128, 4096-byte I/Os, the `verify` workload, and a 90-second run:
~~~{.sh}
cd spdk/
./test/bdev/bdevperf/bdevperf -m 0x4 -z -r /tmp/bdevperf.sock -q 128 -o 4096 -w verify -t 90 &> bdevperf.log &
~~~

Configure bdevperf and add two paths. `-x multipath` attaches the second connection as an additional path to the same controller instead of creating a separate bdev; `-l -1` (controller loss timeout) and `-o 10` (reconnect delay, in seconds) make the initiator retry a lost path indefinitely:
~~~{.sh}
./scripts/rpc.py -s /tmp/bdevperf.sock bdev_nvme_set_options -r -1
./scripts/rpc.py -s /tmp/bdevperf.sock bdev_nvme_attach_controller -b Nvme0 -t tcp -a 172.17.1.13 -s 4420 -f ipv4 -n nqn.2022-02.io.spdk:cnode0 -l -1 -o 10
./scripts/rpc.py -s /tmp/bdevperf.sock bdev_nvme_attach_controller -b Nvme0 -t tcp -a 172.18.1.13 -s 5520 -f ipv4 -n nqn.2022-02.io.spdk:cnode0 -x multipath -l -1 -o 10
~~~
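
Because the second path was attached with `-x multipath`, the two connections should surface as a single bdev (`Nvme0n1`) rather than two. This can be verified by listing the bdevs:
~~~{.sh}
./scripts/rpc.py -s /tmp/bdevperf.sock bdev_get_bdevs
~~~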

## Launch a bdevperf test

Connect to the RPC socket of the bdevperf process and start the test (adjust `PYTHONPATH` to point at the `python/` directory of your SPDK clone):
~~~{.sh}
PYTHONPATH=$PYTHONPATH:/root/src/spdk/python ./test/bdev/bdevperf/bdevperf.py -t 1 -s /tmp/bdevperf.sock perform_tests
~~~

The RPC command returns immediately, leaving the test to run for 90 seconds in the background. On the target server,
observe that only the first path (port) is receiving packets by checking the socket queues with `ss -t`.
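
For example, to watch only the two listener sockets (assuming a reasonably recent `iproute2`):
~~~{.sh}
ss -t '( sport = :4420 or sport = :5520 )'
~~~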

You can view the paths available to the initiator with:
~~~{.sh}
./scripts/rpc.py -s /tmp/bdevperf.sock bdev_nvme_get_io_paths -n Nvme0n1
~~~

## Switching paths

Switch paths from the target server by setting the first path's ANA state to `non_optimized`:
~~~{.sh}
./scripts/rpc.py nvmf_subsystem_listener_set_ana_state nqn.2022-02.io.spdk:cnode0 -t tcp -a 172.17.1.13 -s 4420 -n non_optimized
~~~

Use `ss -t` to verify that the traffic has switched to the second path.
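
To switch back (or to prepare for the round-robin test below), restore the first path to `optimized` with the same RPC:
~~~{.sh}
./scripts/rpc.py nvmf_subsystem_listener_set_ana_state nqn.2022-02.io.spdk:cnode0 -t tcp -a 172.17.1.13 -s 4420 -n optimized
~~~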

## Use round-robin (active_active) path load balancing

First, ensure the ANA for both paths is configured as `optimized` on the target. Then, change the
multipath policy on the initiator to `active_active` (multipath policy is per bdev, so
`bdev_nvme_set_multipath_policy` must be called after `bdev_nvme_attach_controller`):
~~~{.sh}
./scripts/rpc.py -s /tmp/bdevperf.sock bdev_nvme_set_multipath_policy -b Nvme0n1 -p active_active
~~~

Observe with `ss -t` that both connections are receiving traffic (queues build up).
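
Re-running the earlier `bdev_nvme_get_io_paths` command is another way to confirm the change; the reported path states should reflect the new `active_active` policy:
~~~{.sh}
./scripts/rpc.py -s /tmp/bdevperf.sock bdev_nvme_get_io_paths -n Nvme0n1
~~~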