Dynamic label values in Prometheus alerting rules

Question

I am fairly new to monitoring. Here is my question.

I want to fire an alert only for a specific set of assets, selected by asset ID.

My metric looks like this.

test_value{asset_id="123"} 0.215

My alert manager rules look like this.

groups:
- name: iot_rules
  rules:
  - alert: threshhold_alert
    expr: test_value >= 4
    #for: 1m
    labels:
      severity: critical
      probableCause: Communication failure
    annotations:
      summary: 'Error detected on {{ $labels.asset_id }}'

Templating works in the annotations. However, the PromQL expression does not allow templating. Basically, I would like to write an expression like the one below.

expr: test_value{asset_id=$1} >= 4

The value for $1 will come from elsewhere (list of assets).

Is this possible? I don't want to hard-code the asset ID in the expression and thereby create the same rule for each asset. The asset IDs are unknown at development time, and I don't want my customers to have to create the rules.

Recommended answer

PromQL itself does not support templating. You do have a few options, though:

  • Have whatever deployment tool you're using (Ansible, Chef, Puppet) populate that $1 with a regular expression that lists all the assets you're interested in (and use the =~ matcher instead of = in your PromQL expression).
  • Create another metric (either by pushing it to a Pushgateway or by defining it in a separate rule file) with an asset_id label populated with all the asset IDs you're interested in, e.g.:

should_alert{asset_id="123"} 1
should_alert{asset_id="124"} 1
should_alert{asset_id="125"} 1

and then define your alert expression as:

expr: test_value >= 4 and on (asset_id) should_alert
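
Either option needs something outside Prometheus to turn the asset list into text. The following is a minimal Python sketch of that step, not part of Prometheus or any deployment tool: the function names `build_alert_expr` and `build_should_alert_lines` are hypothetical, and it assumes asset IDs contain only letters, digits, and underscores so that no regex or string escaping is needed. Prometheus anchors regex matchers, so `"123|124"` matches only those exact IDs.

```python
import re

# Assumption: asset IDs are plain tokens. Anything else is rejected rather
# than escaped, which keeps the generated PromQL trivially safe.
_SAFE_ID = re.compile(r"[A-Za-z0-9_]+")


def _clean_ids(asset_ids):
    """Deduplicate, sort, and validate the asset IDs."""
    ids = sorted(set(asset_ids))
    if not ids:
        raise ValueError("at least one asset ID is required")
    for asset_id in ids:
        if not _SAFE_ID.fullmatch(asset_id):
            raise ValueError("unsupported asset ID: %r" % asset_id)
    return ids


def build_alert_expr(asset_ids, metric="test_value", threshold=4):
    """First option: one expression with a regex matcher (=~) over all IDs."""
    pattern = "|".join(_clean_ids(asset_ids))
    return '%s{asset_id=~"%s"} >= %s' % (metric, pattern, threshold)


def build_should_alert_lines(asset_ids):
    """Second option: one should_alert sample per asset, in text exposition form."""
    return ['should_alert{asset_id="%s"} 1' % a for a in _clean_ids(asset_ids)]


if __name__ == "__main__":
    assets = ["123", "124", "125"]
    print("expr:", build_alert_expr(assets))
    print("\n".join(build_should_alert_lines(assets)))
```

With the first function, the rule file has to be regenerated and reloaded whenever the asset list changes. With the second, the rule stays fixed as `test_value >= 4 and on (asset_id) should_alert`, and only the `should_alert` samples change.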
