How to trigger an alert in Kubernetes using Prometheus Alertmanager


Question

I have set up kube-prometheus in my cluster (https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus). It ships with some default alerts such as "CoreDNSdown". How do I create my own alert?

Could anyone provide a sample alert that sends an email to my Gmail account?

I followed "Alert when docker container pod is in Error or CrashLoopBackOff kubernetes", but I couldn't make it work.

Recommended answer

To send an alert to your Gmail account, put the Alertmanager configuration in a file, say alertmanager.yaml:

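# Unquoted EOF: the shell substitutes $GMAIL_ACCOUNT and $GMAIL_AUTH_TOKEN while writing the file.
cat <<EOF > alertmanager.yaml
route:
  group_by: [alertname]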
  # Send all notifications to me.
  receiver: email-me

receivers:
- name: email-me
  email_configs:
  - to: $GMAIL_ACCOUNT
    from: $GMAIL_ACCOUNT
    smarthost: smtp.gmail.com:587
    auth_username: "$GMAIL_ACCOUNT"
    auth_identity: "$GMAIL_ACCOUNT"
    auth_password: "$GMAIL_AUTH_TOKEN"
EOF
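The heredoc above substitutes $GMAIL_ACCOUNT and $GMAIL_AUTH_TOKEN at the moment the file is written, so both must be set in the shell beforehand. The following is a minimal sketch of that step, assuming placeholder values and a scratch file named alertmanager-preview.yaml (both hypothetical, not part of the original answer). Gmail generally expects an app password in this field rather than the account password, but check your account settings.

```shell
# Hypothetical placeholder values: substitute your own address and app password.
export GMAIL_ACCOUNT="you@gmail.com"
export GMAIL_AUTH_TOKEN="your-app-password"

# Abort early if either variable is empty, since the heredoc would
# otherwise write blank fields into the file without any warning.
: "${GMAIL_ACCOUNT:?set GMAIL_ACCOUNT first}"
: "${GMAIL_AUTH_TOKEN:?set GMAIL_AUTH_TOKEN first}"

# Scratch file (the name is an assumption) showing how the substitution lands.
cat <<EOF > alertmanager-preview.yaml
receivers:
- name: email-me
  email_configs:
  - to: $GMAIL_ACCOUNT
    auth_password: "$GMAIL_AUTH_TOKEN"
EOF

# Show the rendered recipient line.
grep 'to:' alertmanager-preview.yaml
```

If the printed line shows an empty value after "to:", the variable was not set in the shell that ran the heredoc.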

Since you're using kube-prometheus, you already have a secret named alertmanager-main that holds the default Alertmanager configuration. Recreate the alertmanager-main secret with the new configuration using the following commands:

# The secret already exists, so remove it before creating it again.
kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring

Alertmanager is now set up to send an email whenever it receives an alert from Prometheus.

Next, you need an alert that triggers the email. You can set up a DeadMansSwitch alert, which always fires and is used to check your alerting pipeline:

groups:
- name: meta
  rules:
    - alert: DeadMansSwitch
      expr: vector(1)
      labels:
        severity: critical
      annotations:
        description: This is a DeadMansSwitch meant to ensure that the entire Alerting
          pipeline is functional.
        summary: Alerting DeadMansSwitch

After that, the DeadMansSwitch alert will fire and an email should arrive in your inbox.

Reference:

https://coreos.com/tectonic/docs/latest/tectonic-prometheus-operator/user-guides/configuring-prometheus-alertmanager.html

The DeadMansSwitch alert should go in a ConfigMap that your Prometheus reads. Here are the relevant snippets from my Prometheus setup:

"spec": {
        "alerting": {
            "alertmanagers": [
                {
                    "name": "alertmanager-main",
                    "namespace": "monitoring",
                    "port": "web"
                }
            ]
        },
        "baseImage": "quay.io/prometheus/prometheus",
        "replicas": 2,
        "resources": {
            "requests": {
                "memory": "400Mi"
            }
        },
        "ruleSelector": {
            "matchLabels": {
                "prometheus": "prafull",
                "role": "alert-rules"
            }
        },

The config above is from my prometheus.json file. It names the Alertmanager to use and defines the ruleSelector, which selects rules based on the prometheus and role labels. So my rule ConfigMap looks like this:

kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-rules
  namespace: monitoring
  labels:
    role: alert-rules
    prometheus: prafull
data:
  alert-rules.yaml: |+
   groups:
   - name: alerting_rules
     rules:
       - alert: LoadAverage15m
         expr: node_load15 >= 0.50
         labels:
           severity: major
         annotations:
           summary: "Instance {{ $labels.instance }} - high load average"
           description: "{{ $labels.instance  }} (measured by {{ $labels.job }}) has high load average ({{ $value }}) over 15 minutes."

Replace the rule in the ConfigMap above with the DeadMansSwitch rule.
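For illustration, the ConfigMap with the DeadMansSwitch rule swapped in could be written out as follows. This is a sketch under assumptions rather than the answerer's exact file: the file name prometheus-rules.yaml is hypothetical, and the labels are copied from the ruleSelector shown earlier, so adjust them to match your own Prometheus resource.

```shell
# Quoted 'EOF' stops the shell from expanding $ or backticks inside the YAML,
# which matters for rules that use templates such as {{ $labels.instance }}.
# The file name is an assumption; the labels match the ruleSelector shown above.
cat <<'EOF' > prometheus-rules.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-rules
  namespace: monitoring
  labels:
    role: alert-rules
    prometheus: prafull
data:
  alert-rules.yaml: |+
    groups:
    - name: meta
      rules:
      - alert: DeadMansSwitch
        expr: vector(1)
        labels:
          severity: critical
        annotations:
          description: This is a DeadMansSwitch meant to ensure that the entire Alerting pipeline is functional.
          summary: Alerting DeadMansSwitch
EOF

# Then load it into the cluster:
#   kubectl apply -f prometheus-rules.yaml
```

Newer Prometheus Operator releases load rules from PrometheusRule resources instead of ConfigMaps, so this form likely applies only to older setups like the one described here.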
