Commit 0f57273a authored by Boris Glimcher, committed by Tomasz Zawadzki

docker: add monitoring example



Spins up spdk, telegraf and prometheus containers.
telegraf fetches spdk bdev stats via `bdev_get_iostat`.
prometheus fetches metrics from telegraf.

README also updated to reflect this example.

Change-Id: I4e7347a1bcd7aca67b4e5b52ebbe662d99f29a3a
Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/20332


Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
parent ae5e1aaf
+21 −0
@@ -94,6 +94,27 @@ docker-compose exec storage-target rpc.py bdev_get_bdevs
docker-compose exec proxy-container rpc.py nvmf_get_subsystems
~~~

## Monitoring

`docker-compose.monitoring.yaml` shows an example deployment of SPDK-based storage containers with monitoring.

Running `docker-compose -f docker-compose.monitoring.yaml up` creates 3 docker containers:

- storage-target: contains an SPDK NVMe-oF target exposing a single subsystem based on a malloc bdev.
- [telegraf](https://www.influxdata.com/time-series-platform/telegraf/): an agent with a very small memory footprint for collecting and sending metrics and events.
- [prometheus](https://prometheus.io/): a leading open-source monitoring solution.

`telegraf` connects to `spdk` via `rpc_http_proxy.py` and issues the `bdev_get_iostat` RPC to fetch bdev statistics.
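
The request described above can also be sent by hand, which is a quick way to check that the proxy answers before involving telegraf. This is a minimal sketch, not part of SPDK: the helper names `spdk_rpc_body` and `spdk_rpc` and the `SPDK_RPC_ADDR` variable are hypothetical, and the address and credentials are assumed to match the example compose file.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not shipped with SPDK.

# Build a JSON-RPC request body for a method that takes no parameters.
spdk_rpc_body() {
	printf '{"id":1, "method": "%s"}\n' "$1"
}

# POST the request to rpc_http_proxy.py using HTTP basic auth.
# Address and credentials are assumptions taken from the example compose file.
spdk_rpc() {
	curl --fail --silent --user "spdkuser:spdkpass" \
		--header "Content-Type: application/json" \
		--data "$(spdk_rpc_body "$1")" \
		"http://${SPDK_RPC_ADDR:-192.168.42.2:9009}"
}

# Print the body that would be sent for bdev_get_iostat.
spdk_rpc_body bdev_get_iostat
```

With the containers running, and from a host that can reach the `spdk` network, `spdk_rpc bdev_get_iostat` should return the JSON document from which telegraf extracts the per-bdev counters.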

To see the data change, once all 3 containers are up, run `docker-compose run traffic-generator-nvme` to generate some traffic.

Open the Prometheus UI or query it from the command line, e.g.:

~~~{.sh}
curl --fail 'http://127.0.0.1:9090/api/v1/query?query=spdk_bytes_read'
curl --fail 'http://127.0.0.1:9090/api/v1/query?query=spdk_bytes_written'
~~~
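
Queries that contain parentheses or brackets need URL encoding, which `curl` can do itself with `--get --data-urlencode`. The sketch below assumes Prometheus is reachable at the same address as in the commands above; the `prom_query` helper and the `PROM_URL` and `CURL` variables are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; it is not part of SPDK or Prometheus.

# Run an instant query against the Prometheus HTTP API.
# --get turns the --data-urlencode payload into a URL-encoded query string.
# CURL can be overridden (e.g. CURL=echo) to print the arguments as a dry run.
prom_query() {
	"${CURL:-curl}" --fail --silent --get \
		--data-urlencode "query=$1" \
		"${PROM_URL:-http://127.0.0.1:9090}/api/v1/query"
}
```

For example, `prom_query 'rate(spdk_bytes_read[5m])'` asks for the per-second read rate over the last five minutes instead of the raw counter.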

## Caveats

- If you run docker < 20.10 under distro which switched fully to cgroups2
+55 −0
# SPDX-License-Identifier: Apache-2.0
# Copyright (C) 2021 Intel Corporation
# Copyright (c) 2023 Dell Inc, or its subsidiaries.
#

version: "3.8"
services:
  build_base:
    image: spdk
    build:
      context: build_base
    container_name: build_base

  storage-target:
    image: spdk-app
    build:
      context: spdk-app
    container_name: storage-target
    depends_on:
      - build_base
    networks:
      spdk:
        ipv4_address: 192.168.42.2
    volumes:
      - /dev/hugepages:/dev/hugepages
      - ./spdk-app/storage-target.conf:/config
    environment:
      - SPDK_HTTP_PROXY=0.0.0.0 9009 spdkuser spdkpass
    privileged: true

  telegraf:
    image: docker.io/library/telegraf:1.28
    volumes:
      - ./monitoring/telegraf.conf:/etc/telegraf/telegraf.conf:ro
    depends_on:
      - storage-target
    networks:
      spdk:

  prometheus:
    image: docker.io/prom/prometheus:v2.47.1
    volumes:
      - ./monitoring/prometheus.yaml:/etc/prometheus/prometheus.yml:z
    depends_on:
      - telegraf
    networks:
      spdk:

networks:
  spdk:
    name: "spdk"
    ipam:
      config:
        - subnet: 192.168.42.0/29
          gateway: 192.168.42.1
+10 −0
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2023 Dell Inc, or its subsidiaries.
#

scrape_configs:
  - job_name: telegraf
    metrics_path: /metrics
    static_configs:
      - targets:
          - telegraf:9126
+27 −0
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2023 Dell Inc, or its subsidiaries.
#

[[inputs.http]]
  urls = ["http://storage-target:9009"]
  headers = {"Content-Type" = "application/json"}
  method = "POST"
  username = "spdkuser"
  password = "spdkpass"
  body = '{"id":1, "method": "bdev_get_iostat"}'
  data_format = "json"
  name_override = "spdk"
  json_strict = true
  tag_keys = ["name"]
  json_query = "result.bdevs"

[[outputs.file]]
  files = ["stdout"]
  data_format = "influx"

[[outputs.prometheus_client]]
  listen = ":9126"
  metric_version = 2
  path = "/metrics"
  string_as_label = true
  export_timestamp = true
+4 −0
@@ -33,4 +33,8 @@ fi
# Wait a bit to make sure ip is in place
sleep 2s

if [[ -n $SPDK_HTTP_PROXY ]]; then
	rpc_http_proxy.py $SPDK_HTTP_PROXY &
fi

exec "$app" "${args[@]}"
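
The unquoted `$SPDK_HTTP_PROXY` in the hunk above relies on shell word splitting to turn one environment variable into several arguments. A small sketch of that behavior, using the value from the example compose file; the four values appear to map to listen address, port, user name and password, which is an assumption about `rpc_http_proxy.py` rather than something this change states.

```bash
#!/usr/bin/env bash
# Same value as in the example compose file.
SPDK_HTTP_PROXY="0.0.0.0 9009 spdkuser spdkpass"

# Unquoted expansion undergoes word splitting, as in the entrypoint hunk.
set -- $SPDK_HTTP_PROXY
echo "args=$# host=$1 port=$2 user=$3 password=$4"

# Quoted expansion would pass the whole string as a single argument.
set -- "$SPDK_HTTP_PROXY"
echo "args=$#"
```

The first `echo` reports four arguments and the second reports one, so quoting the variable in the entrypoint would hand the proxy a single combined argument. The flip side is that none of the values can contain whitespace.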