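A First Try at Monitoring Docker with Prometheus and Grafana

The four services the monitoring stack needs

  • n * Node Exporter (collects host hardware and operating system metrics)
  • n * cAdvisor (collects metrics for the containers running on the host)
  • 1 * Prometheus Server (the main monitoring server)
  • 1 * Grafana (displays the Prometheus metrics in dashboards)

Each monitored server only needs to run Node Exporter and cAdvisor.

Their addresses are then listed under targets in the prometheus.yml configuration file.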

Planning

For now there is only one host (192.168.0.108).

Node Exporter: port 9100

cAdvisor: port 8080

Prometheus: port 9090

Grafana: port 3000

Deployment plan

Start Node Exporter

https://github.com/prometheus/node_exporter/

docker run -d -p 9100:9100 \
-v "/proc:/host/proc" \
-v "/sys:/host/sys" \
-v "/:/rootfs" \
-v "/etc/localtime:/etc/localtime" \
--net=host \
--name=node-exporter \
prom/node-exporter \
--path.procfs /host/proc \
--path.sysfs /host/sys \
--collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)"

Start cAdvisor

https://github.com/google/cadvisor

docker run -d \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
--net=host \
-v "/etc/localtime:/etc/localtime" \
google/cadvisor:latest

Start Prometheus

Prometheus - Monitoring system & time series database

The Prometheus configuration file mainly lists the addresses to scrape (the Node Exporter and cAdvisor endpoints of every monitored machine).

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "docker-cluster"
    static_configs:
      - targets: ["192.168.0.108:9100","192.168.0.108:8080"]
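Because YAML indentation mistakes are easy to make in this file, it may be worth validating it before starting the server. The following is a minimal sketch, assuming the file is saved at /opt/sc/runner/prometheus.yml (the path mounted in the run command in this post) and that the prom/prometheus image provides promtool on its PATH. The function name check_prometheus_config is hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash

# check_prometheus_config [FILE]
# Validates a Prometheus config file with promtool from the prom/prometheus image.
# The helper name is made up for this post; it is not part of Prometheus.
check_prometheus_config() {
  local cfg="${1:-/opt/sc/runner/prometheus.yml}"

  # Fail early with a clear message instead of letting docker create a directory.
  if [ ! -f "$cfg" ]; then
    echo "config not found: $cfg" >&2
    return 1
  fi

  # Mount the file read-only and run promtool instead of the server.
  docker run --rm \
    -v "$cfg:/etc/prometheus/prometheus.yml:ro" \
    --entrypoint promtool \
    prom/prometheus \
    check config /etc/prometheus/prometheus.yml
}
```

Calling check_prometheus_config with no argument checks the default path; a non-zero exit status means the file is missing or promtool rejected it.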

Start the container, taking care to mount the configuration file at the correct path.

docker run -d -p 9090:9090 \
-v /opt/sc/runner/prometheus.yml:/etc/prometheus/prometheus.yml \
-v "/etc/localtime:/etc/localtime" \
--name prometheus \
--net=host \
prom/prometheus

Start Grafana

docker run -d -i -p 3000:3000 \
-v "/etc/localtime:/etc/localtime" \
-e "GF_SERVER_ROOT_URL=http://grafana.server.name" \
-e "GF_SECURITY_ADMIN_PASSWORD=admin8888" \
--name grafana \
--net=host \
grafana/grafana

Log in with username admin and password admin8888.
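Once all four containers are running, a quick probe of each port confirms that the plan above is actually in effect. This is a minimal sketch, assuming the single host 192.168.0.108 and the ports listed under Planning. The paths are the health or metrics endpoints these services expose as far as I know (/metrics for Node Exporter, /healthz for cAdvisor, /-/healthy for Prometheus, /api/health for Grafana), and the function names endpoint_url and check_all are hypothetical.

```shell
#!/usr/bin/env bash

# Assumed host from the plan in this post; override with HOST=... if needed.
HOST="${HOST:-192.168.0.108}"

# endpoint_url SERVICE
# Prints the URL probed for a service, or returns 1 for an unknown name.
endpoint_url() {
  case "$1" in
    node-exporter) echo "http://${HOST}:9100/metrics" ;;
    cadvisor)      echo "http://${HOST}:8080/healthz" ;;
    prometheus)    echo "http://${HOST}:9090/-/healthy" ;;
    grafana)       echo "http://${HOST}:3000/api/health" ;;
    *)             return 1 ;;
  esac
}

# check_all
# Probes every service once and prints OK or FAIL per service.
check_all() {
  local svc
  for svc in node-exporter cadvisor prometheus grafana; do
    if curl -fsS --max-time 5 -o /dev/null "$(endpoint_url "$svc")"; then
      echo "OK   $svc"
    else
      echo "FAIL $svc"
    fi
  done
}
```

Running check_all on any machine that can reach the host prints one line per service; a FAIL line usually means the container is not running or the port is blocked.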

Configure Prometheus as a data source

Open Grafana in a browser at http://192.168.0.108:3000.

Configuration - Data sources - Add data source - choose the Prometheus type - set the URL to http://192.168.0.108:9090; the other fields can keep their defaults.
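The same data source can also be created without the UI through Grafana's HTTP API (POST /api/datasources with basic auth). The following is a minimal sketch, assuming the addresses and admin credentials used in this post. The function names and the data source name "Prometheus" are illustrative choices, not something Grafana requires.

```shell
#!/usr/bin/env bash

# Assumed addresses from this post; override through the environment if needed.
GRAFANA_URL="${GRAFANA_URL:-http://192.168.0.108:3000}"
PROM_URL="${PROM_URL:-http://192.168.0.108:9090}"

# datasource_payload
# Prints the JSON body describing a Prometheus data source.
datasource_payload() {
  printf '{"name":"Prometheus","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$PROM_URL"
}

# add_prometheus_datasource USER PASSWORD
# Sends the payload to Grafana; curl -f makes HTTP errors return non-zero.
add_prometheus_datasource() {
  curl -fsS -X POST "$GRAFANA_URL/api/datasources" \
    -u "$1:$2" \
    -H "Content-Type: application/json" \
    -d "$(datasource_payload)"
}
```

With the credentials from this post, the call would be add_prometheus_datasource admin admin8888. Keeping the payload in its own function makes it easy to inspect what will be sent before touching Grafana.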

Import a dashboard

Search the site below for the dashboard you want to add.

Recommended dashboard IDs: 179, 893, 8919

Grafana Dashboards - discover and share dashboards for Grafana. | Grafana Labs

Fixing the error when starting cAdvisor with Docker

Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpuacct,cpu: no such file or directory

mount -o remount,rw '/sys/fs/cgroup'
ln -s /sys/fs/cgroup/cpu,cpuacct /sys/fs/cgroup/cpuacct,cpu

Then restart the container.
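The symlink step of this fix can be made safe to re-run, so that it does nothing when the link already exists and refuses to create a dangling link. This is a minimal sketch, assuming a host where /sys/fs/cgroup/cpu,cpuacct exists as in the commands above. The function name ensure_cpuacct_link is hypothetical, and the optional argument exists only so the logic can be tried against a scratch directory.

```shell
#!/usr/bin/env bash

# ensure_cpuacct_link [CGROUP_ROOT]
# Creates CGROUP_ROOT/cpuacct,cpu -> CGROUP_ROOT/cpu,cpuacct if it is missing.
ensure_cpuacct_link() {
  local root="${1:-/sys/fs/cgroup}"
  local target="$root/cpu,cpuacct"
  local link="$root/cpuacct,cpu"

  # Nothing to do when the path cAdvisor looks for already resolves.
  if [ -e "$link" ]; then
    echo "already present: $link"
    return 0
  fi

  # Avoid creating a dangling link on hosts with a different cgroup layout.
  if [ ! -d "$target" ]; then
    echo "missing $target, not creating link" >&2
    return 1
  fi

  ln -s "$target" "$link"
}
```

On the host this would be run as root after the remount, for example mount -o remount,rw /sys/fs/cgroup && ensure_cpuacct_link, followed by restarting the cAdvisor container.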