How to find metrics about CPU/MEM for the pod running on a Kubernetes cluster on Prometheus


Problem description


I have Prometheus set up via Helm from Terraform, and it is configured to connect to my Kubernetes cluster. I can open Prometheus, but I am not sure which metric to choose from the list to view the CPU/MEM of running pods/jobs. Here are all the pods running, listed with this command (test1 is the kube namespace):

kubectl -n test1 get pods

[Screenshot: pods running]

When I am in Prometheus, I see many metrics related to CPU, but I am not sure which one to choose:

[Screenshot: prom1]

I tried one of them, but its namespace is prometheus and it comes from prometheus-node-exporter. I don't see my cluster or my namespace test1 anywhere in the results.

[Screenshot: prom2]

Could you please help me? Thank you very much in advance.

UPDATE (screenshot): I need to concentrate on one specific namespace. Normally I use this command:

kubectl get pods --all-namespaces | grep hermatwin

The first line of the output shows namespace = jobs, so I think jobs is the namespace.

No results when I set the calendar to last Friday:

UPDATE (screenshot, April 20): I tried to select a 2-day range starting last Saturday, 17 April, but I don't see any results:

And if I remove the (namespace="jobs") condition, I don't see any results either:

I reran the job (a simulation job) just now and executed the Prometheus query while the job was still running, but I still don't get any results. Here you can see that my jobs were running:

I don't get any results:

When I use a simple filter, just container_cpu_usage_seconds_total, I can see namespace="jobs".

Solution

node_cpu_seconds_total is a metric from node-exporter, the exporter that reports machine-level statistics; its metrics are prefixed with node_. What you need are metrics from cAdvisor, which produces container-level metrics prefixed with container_:

container_cpu_usage_seconds_total
container_cpu_load_average_10s
container_memory_usage_bytes
container_memory_rss
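
To see which of these metrics a given Prometheus server actually has, the prefix rule above can be applied to the list of metric names. The following is a minimal sketch, not part of the original answer. It assumes Prometheus is reachable at http://localhost:9090 (for example through a port-forward) and uses the HTTP API endpoint that returns all values of the __name__ label. The helper names metric_names_url and group_by_source are made up for illustration.

```python
from urllib.parse import urljoin

# Assumption: Prometheus is reachable here, e.g. through a port-forward.
PROMETHEUS_URL = "http://localhost:9090"


def metric_names_url(base_url: str = PROMETHEUS_URL) -> str:
    """Return the HTTP API URL that lists every metric name Prometheus knows."""
    return urljoin(base_url, "/api/v1/label/__name__/values")


def group_by_source(names):
    """Split metric names by prefix: node_ (node-exporter) vs container_ (cAdvisor)."""
    groups = {"node-exporter": [], "cAdvisor": [], "other": []}
    for name in names:
        if name.startswith("node_"):
            groups["node-exporter"].append(name)
        elif name.startswith("container_"):
            groups["cAdvisor"].append(name)
        else:
            groups["other"].append(name)
    return groups
```

If the cAdvisor group comes back empty, the container metrics are probably not being scraped at all, which is a different problem from choosing the wrong metric.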

Here are some basic queries to get you started. Be prepared to tweak them, because your setup may use different label names:

CPU Utilisation Per Pod

sum(irate(container_cpu_usage_seconds_total{container!="POD", container=~".+"}[2m])) by (pod)
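
Since label names can differ between setups, it may help to build the query from parameters instead of hard-coding it. The sketch below is illustrative only: cpu_per_pod_query and instant_query_url are hypothetical helper names, the base URL http://localhost:9090 is an assumption, and the alternative labels pod_name and container_name are an assumption about older cAdvisor label naming that you should check against your own metrics. The namespace filter reuses namespace="jobs" from the question.

```python
from urllib.parse import urlencode


def cpu_per_pod_query(namespace=None, pod_label="pod",
                      container_label="container", window="2m"):
    """Build the CPU-per-pod PromQL query with configurable label names."""
    # Exclude the pause container ("POD") and series without a container label.
    matchers = [f'{container_label}!="POD"', f'{container_label}=~".+"']
    if namespace:
        matchers.append(f'namespace="{namespace}"')
    selector = ", ".join(matchers)
    return (
        f"sum(irate(container_cpu_usage_seconds_total{{{selector}}}[{window}]))"
        f" by ({pod_label})"
    )


def instant_query_url(query, base_url="http://localhost:9090"):
    """URL for the instant-query endpoint of the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': query})}"


if __name__ == "__main__":
    query = cpu_per_pod_query(namespace="jobs")
    print(query)
    print(instant_query_url(query))
```

With the default arguments the helper reproduces the query above unchanged; passing pod_label="pod_name" and container_label="container_name" adapts it to a setup that uses those labels.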

RAM Usage Per Pod

sum(container_memory_usage_bytes{container!="POD", container=~".+"}) by (pod)
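
The memory query returns raw byte counts, one series per pod. As a rough sketch of how the result could be consumed outside the Prometheus UI, the function below parses the JSON body of an instant query (the /api/v1/query response format with resultType "vector") and converts each value to MiB. The function name memory_mib_per_pod and the pod names in the sample are invented for illustration; fetching the response over HTTP is left out.

```python
import json


def memory_mib_per_pod(response_text):
    """Map pod name to memory usage in MiB from an instant-query JSON response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    usage = {}
    for series in payload["data"]["result"]:
        pod = series["metric"].get("pod", "<no pod label>")
        # Each instant-vector sample is [unix_timestamp, "value as string"].
        _timestamp, value = series["value"]
        usage[pod] = float(value) / (1024 * 1024)
    return usage


# Hand-written sample in the instant-vector response format (made-up pods).
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "example-pod-a"}, "value": [1618900000, "104857600"]},
            {"metric": {"pod": "example-pod-b"}, "value": [1618900000, "52428800"]},
        ],
    },
})

if __name__ == "__main__":
    for pod, mib in memory_mib_per_pod(SAMPLE).items():
        print(f"{pod}: {mib:.1f} MiB")
```

An empty "result" list yields an empty dictionary, which corresponds to the "no results" situation described in the question.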

In/Out Traffic Rate Per Pod

Beware that pods using host network mode (not isolated) show the traffic rate for the whole node. The * 8 converts bytes to bits for convenience (MBit/s, GBit/s, etc.).

# incoming
sum(irate(container_network_receive_bytes_total[2m])) by (pod) * 8
# outgoing
sum(irate(container_network_transmit_bytes_total[2m])) by (pod) * 8
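
To make the arithmetic behind these queries concrete, here is a simplified sketch of what irate computes for a single series, followed by the * 8 conversion. It is an illustration, not Prometheus code: irate takes the last two samples inside the range window and divides the counter increase by the time between them, and a drop in the counter is treated as a reset. The sample values are made up.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, counter_value) samples.

    Returns None when the window holds fewer than two samples, in which
    case the series produces no result.
    """
    if len(samples) < 2:
        return None
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    if t_last <= t_prev:
        return None
    delta = v_last - v_prev
    if delta < 0:
        # Counter reset: the counter restarted from zero, so count only
        # what has accumulated since the restart.
        delta = v_last
    return delta / (t_last - t_prev)


def bits_per_second(byte_counter_samples):
    """Convert a byte-counter rate to bits per second (the * 8 in the queries)."""
    rate = irate(byte_counter_samples)
    return None if rate is None else rate * 8


if __name__ == "__main__":
    # Made-up receive-byte counter scraped every 15 seconds.
    window = [(0, 0), (15, 1000), (30, 4000)]
    print(irate(window), bits_per_second(window))
```

One practical consequence: if the [2m] window contains fewer than two scrapes of a series, irate returns nothing for it, so a short-lived job or a long scrape interval may call for a wider window.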
