How to calculate containers' CPU usage in Kubernetes with Prometheus as monitoring?
Problem description
I want to calculate the CPU usage of all pods in a Kubernetes cluster. I found two metrics in Prometheus that may be useful:
container_cpu_usage_seconds_total: Cumulative cpu time consumed per cpu in seconds.
process_cpu_seconds_total: Total user and system CPU time spent in seconds.
CPU usage of all pods = per-second increase of sum(container_cpu_usage_seconds_total{id="/"}) / per-second increase of sum(process_cpu_seconds_total)
However, I found that the per-second increase of container_cpu_usage_seconds_total{id="/"} is larger than the per-second increase of sum(process_cpu_seconds_total), so the usage may be larger than 1...
Recommended answer
This is what I use to get CPU usage at the cluster level:
sum (rate (container_cpu_usage_seconds_total{id="/"}[1m])) / sum (machine_cpu_cores) * 100
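As a rough illustration of what this query computes, the sketch below redoes the arithmetic in plain Python on made-up numbers. The sample values, the 60-second window, and the helper names `per_second_rate` and `cluster_cpu_percent` are assumptions for illustration only, not Prometheus APIs; the real `rate()` also handles counter resets and extrapolation, which this ignores. Since each core can supply at most one CPU-second per second, dividing by the core count keeps the result between 0 and 100. Dividing by `process_cpu_seconds_total` does not, because, as far as I know, that metric only counts the CPU time of the scraped processes themselves.

```python
def per_second_rate(first_value, last_value, window_seconds):
    """Average per-second increase of a counter over a window (no reset handling)."""
    return (last_value - first_value) / window_seconds


def cluster_cpu_percent(node_samples, window_seconds):
    """node_samples: list of (counter_start, counter_end, cores), one per node."""
    used_cores = sum(per_second_rate(s, e, window_seconds) for s, e, _ in node_samples)
    total_cores = sum(cores for _, _, cores in node_samples)
    return used_cores / total_cores * 100


# Hypothetical data: two nodes, root-cgroup counter sampled 60 s apart.
nodes = [
    (1000.0, 1120.0, 4),  # 120 CPU-seconds in 60 s: 2 of 4 cores busy
    (5000.0, 5060.0, 4),  # 60 CPU-seconds in 60 s: 1 of 4 cores busy
]
print(cluster_cpu_percent(nodes, 60))  # 37.5
```

With these numbers, 3 of the 8 cores are busy on average, so the cluster-level usage comes out at 37.5%.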
I also track the CPU usage of each pod:
sum (rate (container_cpu_usage_seconds_total{image!=""}[1m])) by (pod_name)
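The per-pod query can be pictured the same way: take the per-second rate of every container series, drop the series without an image, and add up what remains per `pod_name`. The following is a minimal sketch under assumed data; the label sets, counter values, and the function name `cpu_by_pod` are hypothetical, and counter resets are again ignored.

```python
from collections import defaultdict


def cpu_by_pod(series, window_seconds):
    """series: list of (labels, counter_start, counter_end), one per container."""
    totals = defaultdict(float)
    for labels, start, end in series:
        if labels.get("image", "") == "":
            continue  # mirrors image!="": skip series that have no container image
        totals[labels["pod_name"]] += (end - start) / window_seconds
    return dict(totals)


# Hypothetical series sampled 60 s apart.
series = [
    ({"pod_name": "web-1", "image": "nginx"}, 10.0, 40.0),
    ({"pod_name": "web-1", "image": "sidecar"}, 5.0, 20.0),
    ({"pod_name": "db-1", "image": "postgres"}, 100.0, 160.0),
    ({"id": "/", "image": ""}, 1000.0, 1120.0),  # cgroup aggregate, filtered out
]
print(cpu_by_pod(series, 60))  # {'web-1': 0.75, 'db-1': 1.0}
```

The values are in cores, so `web-1` is using three quarters of one core here. Note that on newer Kubernetes versions the label is, to my knowledge, `pod` rather than `pod_name`, so the `by (...)` clause may need adjusting.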
I have a complete kubernetes-prometheus solution on GitHub that may help you with more metrics: https://github.com/camilb/prometheus-kubernetes