Dynamic Thresholds in a PromQL Query

Problem description

I have the following definitions:

    - record: node_load5:node_cpu_seconds_total:critical
      expr: 3

    - record: node_load5:node_cpu_seconds_total:critical
      expr: 5
      labels:
        app: loki

    - record: Node_High_LoadAverage:Query
      expr: ((node_load5 / count without (cpu, mode) (node_cpu_seconds_total{mode="system"})))

I want to use a query in a Prometheus alert that evaluates the threshold using the value specified by the label, or a default value when no label-specific value exists.

I can evaluate node_load5:node_cpu_seconds_total:critical to two different values (3 by default, 5 for app="loki"), and I use Node_High_LoadAverage:Query to abbreviate the load query.

I'm trying to use group_left and on to perform a join. This is my query, which isn't working:

Node_High_LoadAverage:Query > on (app) group_left node_load5:node_cpu_seconds_total:critical

Has anybody done something similar and is willing to share a working example?

Thanks!

Recommended answer

I figured it out.

something > on (team) group_left something_too_high_threshold

This uses on (team) together with group_left as a "join" on the "team" label, so that each metric series is compared against the threshold defined for its team. In this example, the "team" label must exist on the "something" metrics as well as on the corresponding "something_too_high_threshold" expressions.
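To connect this back to the original goal of alerting, the join could be wrapped in an alerting rule along the following lines. This is only a minimal sketch: the group name, alert name, `for` duration, `severity` label, and annotation text are illustrative assumptions and not part of the original answer.

```yaml
groups:
  - name: dynamic-thresholds-example    # hypothetical group name
    rules:
      - alert: SomethingTooHigh          # hypothetical alert name
        # Compare each "something" series with the threshold series that
        # carries the same "team" label. group_left allows many "something"
        # series per threshold and keeps the left-hand labels on the result.
        expr: something > on (team) group_left something_too_high_threshold
        for: 5m                          # assumed duration
        labels:
          severity: critical             # assumed label
        annotations:
          summary: 'something is {{ $value }} for team "{{ $labels.team }}"'
```

With the mock rules listed next, this expression would return only the team="foo" series (300 > 200) and the team="bar" series (500 > 400). The series without a team label and the one with team="elsewhere" have no matching threshold, so they drop out of the result.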

Here are the mock rules:

    - record: something
      expr: 600
    - record: something
      expr: 400
      labels:
        team: elsewhere
    - record: something
      expr: 300
      labels:
        team: foo
    - record: something
      expr: 500
      labels:
        team: bar
    - record: something_too_high_threshold
      expr: 200
      labels:
        team: foo
    - record: something_too_high_threshold
      expr: 400
      labels:
        team: bar

This can be improved by also handling metrics where the label doesn't exist:

something > on (team) group_left() ( something_too_high_threshold or on (team) something * 0 + 100 )

This uses the same mock rules as above and provides a default threshold (100) for "something" metrics that either don't carry the label or whose label value is not defined in something_too_high_threshold.
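As a sketch of how the fallback version might look inside an alerting rule, the example below spreads the expression over several lines and traces it against the mock rules in comments. The group name, alert name, `for` duration, and `severity` label are assumptions made for illustration.

```yaml
groups:
  - name: dynamic-thresholds-example    # hypothetical group name
    rules:
      - alert: SomethingTooHigh          # hypothetical alert name
        # Right-hand side: the per-team thresholds where they are defined.
        # For every other "team" value found on "something" (including no
        # team label at all), "something * 0 + 100" supplies a default of 100.
        #
        # Traced against the mock rules:
        #   something (no team label)    600 > 100 (default) -> returned
        #   something{team="elsewhere"}  400 > 100 (default) -> returned
        #   something{team="foo"}        300 > 200           -> returned
        #   something{team="bar"}        500 > 400           -> returned
        expr: |
          something
            > on (team) group_left()
          (
              something_too_high_threshold
            or on (team)
              something * 0 + 100
          )
        for: 5m                          # assumed duration
        labels:
          severity: critical             # assumed label
```

Because `or on (team)` only adds the default for team values that have no entry in something_too_high_threshold, an explicitly defined threshold always takes precedence over the default.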

This web page helped me figure this out: https://www.robustperception.io/using-time-series-as-alert-thresholds
