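Implementing VMware Monitoring
Installing vmware_exporter
Project repository on GitHub: https://github.com/pryorda/vmware_exporter
For a detailed description of each parameter, see the README in the GitHub repository.
This deployment uses a container, so all that is needed is to prepare the configuration file, add it to the container, and start it. The configuration file is as follows: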
```
[root@VM-24-14-centos vmware_exporter]
/usr/local/vmware_exporter
[root@VM-24-14-centos vmware_exporter]
default:
    vsphere_host: ip
    vsphere_user: "username"
    vsphere_password: "password"
    ignore_ssl: True
    specs_size: 5000
    fetch_custom_attributes: True
    fetch_tags: True
    fetch_alarms: True
    collect_only:
        vms: True
        vmguests: True
        datastores: True
        hosts: True
        snapshots: True

vcenter1:
    vsphere_host: ip
    vsphere_user: "username"
    vsphere_password: "password"
    ignore_ssl: True
    specs_size: 5000
    fetch_custom_attributes: True
    fetch_tags: True
    fetch_alarms: True
    collect_only:
        vms: True
        vmguests: True
        datastores: True
        hosts: True
        snapshots: True

vcenter2:
    vsphere_host: ip
    vsphere_user: 'username'
    vsphere_password: 'password'
    ignore_ssl: True
    specs_size: 5000
    fetch_custom_attributes: True
    fetch_tags: True
    fetch_alarms: True
    collect_only:
        vms: True
        vmguests: True
        datastores: True
        hosts: True
        snapshots: True
[root@VM-24-14-centos vmware_exporter]
[root@VM-24-14-centos vmware_exporter]
[root@VM-24-14-centos vmware_exporter]
f8sd7a0dc9e6d6f5cc82288fa83f3bfae3b6a8f832b9d95dba41fd5384dc21f07
[root@VM-24-14-centos vmware_exporter]
[root@VM-24-14-centos vmware_exporter]
CONTAINER ID   IMAGE                     COMMAND                  CREATED          STATUS          PORTS                    NAMES
f8sd7a0dc9e6   pryorda/vmware_exporter   "/usr/local/bin/vmwa…"   50 seconds ago   Up 39 seconds   0.0.0.0:9272->9272/tcp   vmware_exporter
[root@VM-24-14-centos vmware_exporter]
```
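For orientation, the following Python sketch assembles one plausible `docker run` invocation for this kind of setup; it is not the command used above. The file name `config.yml`, the in-container path `/config.yml`, and the `-c` config-file option are assumptions that should be verified against the project README. The sketch also assumes the image's entrypoint is the exporter binary, as the COMMAND column in the container listing suggests. The function only builds the argument list and does not start anything.

```python
import shlex


def build_docker_run(config_path, image="pryorda/vmware_exporter",
                     container_config="/config.yml", port=9272):
    """Return the argv list for starting the exporter container (nothing is executed)."""
    return [
        "docker", "run", "-d",
        "--name", "vmware_exporter",
        # Publish the exporter port seen in the container listing (9272).
        "-p", f"{port}:{port}",
        # Mount the host config read-only; the in-container path is a hypothetical choice.
        "-v", f"{config_path}:{container_config}:ro",
        image,
        # Assumed exporter option for the config file; check the README before relying on it.
        "-c", container_config,
    ]


# Hypothetical file name inside the working directory shown in the listing.
argv = build_docker_run("/usr/local/vmware_exporter/config.yml")
print(shlex.join(argv))
```

Printing the command instead of running it makes it easy to review the mount and port mapping before pasting it into a shell.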
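Open the following URL in a browser to check that the exporter is configured and running correctly. If errors occur, check the exporter's logs to troubleshoot them.
http://PrometheusIP:9272/metrics?section=vcenter1&target=ip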
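The same check can be scripted. The sketch below builds the check URL from its parts and counts the `vmware_` samples in a response, using only the Python standard library. The function names are illustrative, and `PrometheusIP` and `ip` remain placeholders exactly as in the URL above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_metrics_url(exporter_host, section, target, port=9272):
    """Build the exporter check URL with the section and target query parameters."""
    query = urlencode({"section": section, "target": target})
    return f"http://{exporter_host}:{port}/metrics?{query}"


def count_vmware_samples(exposition_text):
    """Count sample lines (not '#' comment lines) whose metric name starts with 'vmware_'."""
    return sum(1 for line in exposition_text.splitlines()
               if line.startswith("vmware_"))


def fetch_metrics(url, timeout=30):
    """Download the metrics page as text; raises an exception on HTTP or network errors."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


print(build_metrics_url("PrometheusIP", "vcenter1", "ip"))
```

Calling `count_vmware_samples(fetch_metrics(url))` against a running exporter should return a non-zero count when the vCenter connection works; a count of zero or an exception is the cue to look at the exporter logs.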
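After the Prometheus configuration has been changed (see the scrape jobs in the next section), reload it through the HTTP API; this endpoint is only served when Prometheus is started with --web.enable-lifecycle:
curl -XPOST http://localhost:9090/-/reload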
Adding the VMware Monitoring Jobs to Prometheus
```
[root@VM-24-14-centos vmware_exporter]
  - job_name: 'vmware_vcenter_1'
    metrics_path: '/metrics'
    static_configs:
      - targets:
        - 'ip'
    params:
      section: [vcenter1]
    relabel_configs:
      - source_labels: [__address__]
        target_label: Prometheus
      - source_labels: [Prometheus]
        target_label: instance
      - target_label: __address__
        replacement: PrometheusIP:9272

  - job_name: 'vmware_vcenter_2'
    metrics_path: '/metrics'
    static_configs:
      - targets:
        - 'ip'
    params:
      section: [vcenter2]
    relabel_configs:
      - source_labels: [__address__]
        target_label: Prometheus
      - source_labels: [Prometheus]
        target_label: instance
      - target_label: __address__
        replacement: PrometheusIP:9272
```
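To make the effect of the three `relabel_configs` rules concrete, here is a small Python simulation of what happens to one target. It implements only the default `replace` action with the default regex `(.*)`, which is all these rules use. It is an illustration under that assumption, not Prometheus's actual relabeling code, and the helper names are made up for the example.

```python
def apply_replace(labels, rule):
    """Apply one relabel rule using the default 'replace' action and default regex (.*)."""
    sources = rule.get("source_labels", [])
    # Prometheus joins the source label values with ';' before matching.
    joined = ";".join(labels.get(name, "") for name in sources)
    # With the default regex, $1 is the whole joined value.
    new_value = rule.get("replacement", "$1").replace("$1", joined)
    result = dict(labels)
    if new_value:
        result[rule["target_label"]] = new_value
    else:
        # An empty result removes the target label.
        result.pop(rule["target_label"], None)
    return result


def relabel(labels, rules):
    """Apply the rules in order, as Prometheus does."""
    for rule in rules:
        labels = apply_replace(labels, rule)
    return labels


def scraped_labels(labels):
    """Labels starting with '__' are dropped once relabeling has finished."""
    return {k: v for k, v in labels.items() if not k.startswith("__")}


# The three rules from the job definition, with the same placeholder values.
RULES = [
    {"source_labels": ["__address__"], "target_label": "Prometheus"},
    {"source_labels": ["Prometheus"], "target_label": "instance"},
    {"target_label": "__address__", "replacement": "PrometheusIP:9272"},
]

final = relabel({"__address__": "ip"}, RULES)
print(final)
print(scraped_labels(final))
```

The scrape is sent to the exporter at `PrometheusIP:9272`, while the stored series carry `instance="ip"` together with the intermediate `Prometheus="ip"` label. In the common multi-target exporter pattern the intermediate label is named `__param_target`, which Prometheus also sends as the `target` URL parameter; the jobs above pass only `section` through `params`.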
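Grafana Dashboard Configuration
The dashboards come from the dashboard JSON files in the project repository and from dashboards downloaded from the official Grafana site. Because the cluster fields in our environment do not match the fields the dashboards filter on, the dashboards have to be customized.
Project dashboard JSON: https://github.com/pryorda/vmware_exporter/tree/main/dashboards
ESXi: https://grafana.com/grafana/dashboards/10076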
VMware stats: https://grafana.com/grafana/dashboards/11243
Alerting Rules
```
  - name: alert_VMware
    rules:
    - alert: Monitoring Alert - VMware - Datastore Usage
      expr: (vmware_datastore_capacity_size - vmware_datastore_freespace_size) * 100 / vmware_datastore_capacity_size >= 98
      for: 2m
      labels:
        severity: critical
        environment: ops
      annotations:
        summary: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware datastore usage exceeds 98%!"
        description: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware datastore usage exceeds 98%!"
        resolved: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware datastore usage exceeded 98%, now recovered!"
        grafana_url: ""

    - alert: Monitoring Alert - VMware - Host CPU Usage
      expr: vmware_host_cpu_usage * 100 / vmware_host_cpu_max >= 80
      for: 2m
      labels:
        severity: critical
        environment: ops
      annotations:
        summary: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware host CPU usage exceeds 80%!"
        description: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware host CPU usage exceeds 80%!"
        resolved: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware host CPU usage exceeded 80%, now recovered!"
        grafana_url: ""

    - alert: Monitoring Alert - VMware - Host Memory Usage
      expr: vmware_host_memory_usage * 100 / vmware_host_memory_max > 98
      for: 2m
      labels:
        severity: critical
        environment: ops
      annotations:
        summary: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware host memory usage exceeds 98%!"
        description: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware host memory usage exceeds 98%!"
        resolved: "{{$labels.project}} {{$labels.instance}}: {{$labels.project}} VMware host memory usage exceeded 98%, now recovered!"
        grafana_url: ""
```
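The threshold arithmetic in these expressions is easy to check by hand. The sketch below mirrors the datastore and host expressions in Python with made-up sample values (the numbers are hypothetical, not measurements) and shows the difference between the `>=` comparison of the first two rules and the strict `>` of the memory rule.

```python
def usage_percent(used, total):
    """Mirror of `usage * 100 / max` from the host CPU and memory rules."""
    return used * 100 / total


def datastore_usage_percent(capacity, free):
    """Mirror of `(capacity - free) * 100 / capacity` from the datastore rule."""
    return (capacity - free) * 100 / capacity


# Hypothetical datastore: capacity 1000 units, 15 units free.
ds = datastore_usage_percent(1000, 15)
print("datastore:", ds, "condition met:", ds >= 98)

# Hypothetical host at exactly 80% CPU: '>=' means the condition is met.
cpu = usage_percent(8000, 10000)
print("cpu:", cpu, "condition met:", cpu >= 80)

# Hypothetical host at exactly 98% memory: the rule uses '>', so the condition is not met.
mem = usage_percent(98, 100)
print("memory:", mem, "condition met:", mem > 98)
```

In Prometheus the condition must additionally hold for the whole `for: 2m` window before the alert fires, which this sketch does not model.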