Prometheus Metrics Overview on Grafana
In this post, some variables defined in Grafana are used for Prometheus metrics, including $__rate_interval. This article describes the benefit of this variable.
Kubernetes Metrics
These metrics require installing some of the following:
Node metrics
CPU utilization per node:
1 - (avg by (instance)(rate(node_cpu_seconds_total{mode="idle"}[$__rate_interval])))
Memory utilization per node:
1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)
Disk utilization per node:
1 - (node_filesystem_avail_bytes / node_filesystem_size_bytes)
Number of pods in a given phase on a node (from this comment):
sum by(node)(kube_pod_info{} * on(pod, namespace) group_right(node) kube_pod_status_phase{phase="$phase"})
Pod metrics
CPU usage per container (in cores):
sum by (container)(rate(container_cpu_usage_seconds_total{}[$__rate_interval]))
CPU usage against requests:
sum by (namespace, container)(rate(container_cpu_usage_seconds_total{}[$__rate_interval])) / sum by (namespace, container)(kube_pod_container_resource_requests{resource="cpu", unit="core"})
CPU throttling:
sum by (namespace, container)(rate(container_cpu_cfs_throttled_periods_total{}[$__rate_interval])) / sum by (namespace, container)(rate(container_cpu_cfs_periods_total{}[$__rate_interval]))
CPU requests per namespace:
sum by (exported_namespace)(kube_pod_container_resource_requests{resource="cpu", unit="core"})
Memory usage per container (working set, in bytes):
- Max:
max by (namespace, container)(container_memory_working_set_bytes{})
- Median:
quantile by (namespace, container)(0.5, container_memory_working_set_bytes{})
- Min:
min by (namespace, container)(container_memory_working_set_bytes{})
Memory requests per namespace:
sum by (exported_namespace)(kube_pod_container_resource_requests{resource="memory", unit="byte"})
Persistent volumes
Usage ratio per persistent volume claim:
sum by (persistentvolumeclaim)(kubelet_volume_stats_used_bytes) / sum by (persistentvolumeclaim)(kubelet_volume_stats_capacity_bytes)
Grafana configurations
The scrape interval of Prometheus
To use $__rate_interval, the scrape interval configured on the Grafana Prometheus data source should match the scrape interval of Prometheus itself.
The Grafana data source defaults to 15 seconds, while Prometheus defaults to 1m.
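A short worked example may help show why the mismatch matters. As far as the Grafana documentation describes it, $__rate_interval is the larger of "$__interval plus the scrape interval" and "four times the scrape interval", where the scrape interval is the value set on the Grafana data source. The symbols R, I and s below are shorthand introduced here for illustration, and the values of I are assumed.

```latex
% R = $__rate_interval, I = $__interval, s = scrape interval set on the Grafana data source
\begin{align*}
R &= \max(I + s,\ 4s) \\
s = 15\,\mathrm{s},\ I = 15\,\mathrm{s}: \quad R &= \max(30\,\mathrm{s},\ 60\,\mathrm{s}) = 60\,\mathrm{s} \\
s = 60\,\mathrm{s},\ I = 60\,\mathrm{s}: \quad R &= \max(120\,\mathrm{s},\ 240\,\mathrm{s}) = 240\,\mathrm{s}
\end{align*}
```

If Prometheus actually scrapes every 1m, a 60-second window often contains only one sample, and rate() needs at least two samples to return a value, so panels can show gaps. A 240-second window spans roughly four scrapes.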
To configure it in Grafana through a YAML file, set the jsonData.timeInterval field on the Prometheus data source. This was also answered in this Stack Overflow answer.
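A minimal sketch of such a data source provisioning file is shown here. The data source name and the URL are placeholders assumed for illustration, and the 1m value assumes Prometheus is running with its default scrape interval.

```yaml
# Hypothetical file, e.g. placed under Grafana's provisioning/datasources/ directory
apiVersion: 1

datasources:
  - name: Prometheus                   # placeholder name
    type: prometheus
    access: proxy
    url: http://prometheus:9090        # placeholder URL, adjust to your setup
    jsonData:
      # Should equal the scrape_interval configured in Prometheus
      timeInterval: 1m
```

If Prometheus scrapes at a different interval, use that value for timeInterval instead.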