Commit ff9642e3 authored by Michal Berger, committed by Tomasz Zawadzki

test/nvme/cuse: Include all nvme devices in the test



If there are any PCI_BLOCKED devices that were picked up initially by
the autotest env, they may interfere with the nvme-cli tests.

Since nvme-cli doesn't really care about our PCI_BLOCKED setup, it
will always include all the nvme ctrls; this is also true for the SPDK
plugin.

Consider a scenario like this:

  nvme0 0000:5e:00.0
  nvme1 0000:5f:00.0

  PCI_BLOCKED=0000:5f:00.0

This was the setup under the affected node which caused the #3146
issue to pop up - the 0000:5f:00.0 was a zoned nvme device which
autotest, for a good reason, puts on the PCI_BLOCKED list.

When we enter the cuse tests, the first setup.sh reset yields:
  nvme0 -> nvme
  nvme1 -> can't touch, stays under the kernel

nvme-cli will now include both the nvme0 and nvme1 devices. Depending
on the usage, the order may look like the following:

/dev/nvme1n1 ... 5GB  -output cleanup-> nvme1n1
/dev/nvme0n1 ... 20GB -output cleanup-> nvme0n1

scan_nvme_ctrls() will now skip nvme1, as it's listed under
PCI_BLOCKED.
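A minimal sketch of that skipping behavior (this is not the real
scan_nvme_ctrls() implementation - the ctrl -> BDF map is hard-coded
here instead of being read from sysfs, purely to illustrate why nvme1
drops out of the list):

```shell
# Hypothetical sketch: a scan loop that honors PCI_BLOCKED. Names and
# the device map mirror the scenario above, not a real system.
PCI_BLOCKED="0000:5f:00.0"

declare -A bdfs=([nvme0]="0000:5e:00.0" [nvme1]="0000:5f:00.0")

ordered_ctrls=()
for ctrl in nvme0 nvme1; do
	# Skip any controller whose BDF appears on the blocked list
	if [[ " $PCI_BLOCKED " == *" ${bdfs[$ctrl]} "* ]]; then
		continue
	fi
	ordered_ctrls+=("$ctrl")
done

printf '%s\n' "${ordered_ctrls[@]}"
```

With the scenario's block list, only nvme0 survives the loop - exactly
the situation described below.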

So the next setup.sh config yields:
  nvme0 -> vfio-pci
  nvme1 -> can't touch, stays under the kernel

The ordered_ctrls[@] array now holds only nvme0, so SPDK|cuse
will attach only to that device. nvme0 pops up first on the
list, then the plugin lists the remaining nvme devices (which were
not attached to SPDK), so we get:

/dev/spdk/nvme0* -output cleanup-> nvme0
/dev/nvme1n1     -output cleanup-> nvme1

We can see that the order has now changed. Sorting could fix that
particular issue, but the main problem here is that the test is
unintentionally skipping nvme devices - this matters because the
output from the plugin is now completely obscured, i.e., we don't
know which devices were indeed the SPDK ones. With that in mind, when
ALL nvmes are moved to SPDK, the plugin will list them in the right
order (matching nvme-cli's listing of the kernel devices).
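To illustrate why sorting alone is not enough, here is a small sketch
using the two output orders from the scenario above (the device names
are illustrative, taken from the listings earlier in this message):

```shell
# Illustrative only: the two cleaned-up output orders from the scenario.
kernel_order=(nvme0n1 nvme1n1) # nvme-cli listing of the kernel devices
mixed_order=(nvme1n1 nvme0n1)  # order seen when nvme1 stayed under the kernel

# Sorting makes the lists compare equal, but it cannot tell us whether a
# device was silently skipped by the test - that problem stays hidden.
sorted_kernel=$(printf '%s\n' "${kernel_order[@]}" | sort)
sorted_mixed=$(printf '%s\n' "${mixed_order[@]}" | sort)
[[ "$sorted_kernel" == "$sorted_mixed" ]] && echo "sorted lists match"
```

The sorted lists compare equal even though the unsorted ones differ,
which is why attaching all devices, rather than sorting, is the fix.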

So to fix this, the test should always pick up all the nvme devices,
since within its scope the specifics of these devices (e.g. whether a
device is zoned) should not really matter.

Also, when the test finishes, we need to make sure all devices are
moved back to the kernel so remaining autotest suites can properly
enforce PCI_BLOCKED.
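The restore-on-exit pattern can be sketched like so (hedged: cleanup()
here just echoes instead of invoking the real "$rootdir/scripts/setup.sh"
reset, so the snippet stands alone):

```shell
# Hypothetical sketch of handing devices back to the kernel via an EXIT
# trap. The subshell stands in for the test run; its EXIT trap fires when
# the subshell finishes, appending the cleanup message to the output.
run_output=$(
	cleanup() { echo "setup.sh reset"; }
	trap cleanup EXIT
	echo "running tests"
)
printf '%s\n' "$run_output"
```

An EXIT trap fires whether the test body succeeds or fails, which is
what lets later autotest suites see the devices back under the kernel.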

Change-Id: I219e33b1e90da2102bc9ac6fa1657f270e5f42e5
Signed-off-by: Michal Berger <michal.berger@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20375


Reviewed-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
parent e8afbb64
+7 −1
@@ -7,7 +7,8 @@ testdir=$(readlink -f $(dirname $0))
rootdir=$(readlink -f $testdir/../../..)
source "$testdir/common.sh"

-trap 'killprocess $spdk_tgt_pid' EXIT
+# Give the devices back to the kernel at the end
+trap 'killprocess $spdk_tgt_pid; "$rootdir/scripts/setup.sh" reset' EXIT

nvme() {
	# Apply some custom filters to align output between the plugin's listing and base nvme-cli's
@@ -29,6 +30,11 @@ cuse_out=()

rpc_py=$rootdir/scripts/rpc.py

+# We need to make sure these tests don't discriminate against the PCI_BLOCKED devices
+# since nvme-cli doesn't really care - to make sure all outputs are aligned, we need to
+# include all the devices that we can find.
+export PCI_BLOCKED=""

"$rootdir/scripts/setup.sh" reset
scan_nvme_ctrls

+2 −0
@@ -11,6 +11,8 @@ source $rootdir/test/common/autotest_common.sh
SMARTCTL_CMD='smartctl -d nvme'
rpc_py=$rootdir/scripts/rpc.py

"$rootdir/scripts/setup.sh"

bdf=$(get_first_nvme_bdf)

PCI_ALLOWED="${bdf}" $rootdir/scripts/setup.sh reset