increase() in Prometheus sometimes doubles values: how to avoid?

Question

I've found that for some graphs I get doubled values from Prometheus where there should be just ones.

The query I use:

increase(signups_count[4m])

The scrape interval is set to the recommended maximum of 2 minutes.

If I query the actual stored data:

curl -gs 'localhost:9090/api/v1/query?query=(signups_count[1h])'

"values":[
     [1515721365.194, "579"],
     [1515721485.194, "579"],
     [1515721605.194, "580"],
     [1515721725.194, "580"],
     [1515721845.194, "580"],
     [1515721965.194, "580"],
     [1515722085.194, "580"],
     [1515722205.194, "581"],
     [1515722325.194, "581"],
     [1515722445.194, "581"],
     [1515722565.194, "581"]
],

I see that there were just two increases. And indeed, if I query at these times, I see the expected result:

curl -gs 'localhost:9090/api/v1/query_range?step=4m&query=increase(signups_count[4m])&start=1515721965.194&end=1515722565.194'

"values": [
     [1515721965.194, "0"],
     [1515722205.194, "1"],
     [1515722445.194, "0"]
],

But Grafana (and the Prometheus GUI) tends to set a different step in queries, and with that step I get a result that is very unexpected for someone unfamiliar with the internal workings of Prometheus:

curl -gs 'localhost:9090/api/v1/query_range?step=15&query=increase(signups_count[4m])&start=1515721965.194&end=1515722565.194'

... skip ...
 [1515722190.194, "0"],
 [1515722205.194, "1"],
 [1515722220.194, "2"],
 [1515722235.194, "2"],
... skip ...

Knowing that increase() is just syntactic sugar for a specific use case of the rate() function, I guess this is how it is supposed to work given the circumstances.

How can I avoid such situations? How do I make Prometheus/Grafana show me ones for ones and twos for twos, most of the time? Other than by increasing the scrape interval (that will be my last resort).

I understand that Prometheus isn't an exact sort of tool, so it is fine with me if I get a good number most of the time rather than all of the time.

What else am I missing?

Recommended answer

This is known as aliasing and is a fundamental problem in signal processing. You can improve this a bit by increasing your sample rate; a 4m range is a bit short with a 2m scrape interval. Try a 10m range.
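To see why a longer range helps, here is a rough sketch of how much the raw counter delta can get scaled up for a given range and scrape interval. It assumes a simplified model of increase(), not Prometheus's full algorithm: the delta between the first and last sample in the window is multiplied by the range divided by the time those samples cover. The function name `extrapolation_factors` is made up for this illustration.

```python
def extrapolation_factors(range_s: int, scrape_s: int) -> tuple[float, float]:
    """Return the (smallest, largest) scaling factor under a simplified model.

    Assumption: increase() scales the first-to-last sample delta by
    range / (time covered by the samples inside the window). Real Prometheus
    applies extra limits on extrapolation that are ignored here.
    """
    # Best case: the window edges line up with scrapes as well as possible.
    max_intervals = range_s // scrape_s
    # Typical case: the window edges fall between scrapes, so one fewer
    # scrape interval is covered by the samples inside the window.
    min_intervals = max_intervals - 1
    if min_intervals < 1:
        raise ValueError("range must span at least two scrape intervals")
    smallest = range_s / (max_intervals * scrape_s)
    largest = range_s / (min_intervals * scrape_s)
    return smallest, largest


# 4m range with a 2m scrape interval: a single increment can show up as 2.
print(extrapolation_factors(240, 120))
# 10m range with a 2m scrape interval: the overshoot is much smaller.
print(extrapolation_factors(600, 120))
```

Under this model a 4m range can double a single increment, while a 10m range inflates it by at most a quarter, which is consistent with the advice to use a longer range.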

Here, for example, the query executed at 1515722220 only sees the 580@1515722085.194 and 581@1515722205.194 samples. That's an increase of 1 over 2 minutes, which extrapolated over 4 minutes is an increase of 2, which is as expected.
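The arithmetic can be reproduced with a small sketch using the samples from the question. It is a simplified approximation, not Prometheus's implementation: it treats the window as closed on both ends (an assumption that matches the output shown in the question), ignores counter resets, and always extrapolates to the full range. The helper name `simplified_increase` is hypothetical.

```python
# Samples from the question as (timestamp in milliseconds, counter value).
# Integer milliseconds avoid floating-point trouble at the window edges.
SAMPLES = [
    (1515721965194, 580),
    (1515722085194, 580),
    (1515722205194, 581),
]

RANGE_MS = 4 * 60 * 1000  # the [4m] range from the query


def simplified_increase(samples, eval_ms, range_ms):
    """Approximate increase(): scale the in-window delta to the full range.

    Assumptions: closed window [eval - range, eval], no counter resets,
    and unconditional extrapolation to the window length.
    """
    window = [(t, v) for t, v in samples if eval_ms - range_ms <= t <= eval_ms]
    if len(window) < 2:
        return None  # not enough samples to compute a delta
    (t_first, v_first), (t_last, v_last) = window[0], window[-1]
    return (v_last - v_first) * range_ms / (t_last - t_first)


# Evaluation times taken from the step=15 output in the question.
for eval_ms in (1515722190194, 1515722205194, 1515722220194):
    print(eval_ms / 1000, simplified_increase(SAMPLES, eval_ms, RANGE_MS))
```

At 1515722205.194 the window still contains three samples covering the full 4 minutes, so the delta of 1 is left alone. Fifteen seconds later the oldest sample has dropped out, only 2 minutes are covered, and the same delta is scaled to 2.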

Any metrics-based monitoring system will have similar artifacts; if you want 100% accuracy, you need logs.
