Kafka Connect Alerting Options?

Question

Are there any alerting options for scenarios where a Kafka Connect Connector or a Connector task fails or experiences errors?

We have Kafka Connect running, and it runs well, but we have had errors that had to be traced and discovered manually. Often a connector has been in an error state for a week before a human notices the problem.

Recommended answer

Since this post was written and answered, Kafka Connect began providing its own official metrics. Apache Kafka Connect exposes these metrics in the legacy JMX format.

If you use the Confluent Kafka Connect Helm Charts (https://github.com/confluentinc/cp-helm-charts/tree/master/charts/cp-kafka-connect), they include a Prometheus metrics exporter.

I monitor and alert on cp_kafka_connect_connect_connector_metrics{status="running"} from the Confluent Helm Chart Prometheus exporter, though many variations on that query are possible.
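
As a rough sketch of how that check could be scripted outside Prometheus's own alerting rules, the Python below runs the query above through the Prometheus HTTP API (`/api/v1/query`) and reports which expected connectors have no `running` series. The Prometheus URL, the `connector` label name, and the connector names are assumptions for illustration, not something the answer specifies; check them against what your exporter actually emits.

```python
import json
import urllib.parse
import urllib.request

# Metric name and selector taken from the answer above.
QUERY = 'cp_kafka_connect_connect_connector_metrics{status="running"}'


def fetch_instant_query(prometheus_url, query=QUERY, timeout=10):
    """Run an instant query against the Prometheus HTTP API and return parsed JSON."""
    params = urllib.parse.urlencode({"query": query})
    url = prometheus_url.rstrip("/") + "/api/v1/query?" + params
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def connectors_not_running(response, expected, label="connector"):
    """Return expected connector names that have no 'running' series.

    `label` is an assumed label name; verify it against your exporter config.
    """
    if response.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % (response,))
    running = {series["metric"].get(label) for series in response["data"]["result"]}
    return sorted(set(expected) - running)


if __name__ == "__main__":
    # Hypothetical Prometheus address and connector names.
    result = fetch_instant_query("http://localhost:9090")
    missing = connectors_not_running(result, ["jdbc-sink", "s3-sink"])
    if missing:
        print("Connectors not reporting status=running:", ", ".join(missing))
```

Comparing against an explicit list of expected connectors means a connector that disappears entirely is also reported, which a plain threshold on the metric value would miss.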

Using the official Kafka Connect metrics is generally preferable for any automated monitoring and alerting setup. This option wasn't available when this post was originally written and answered.

FYI, Kafka still doesn't expose lag metrics, so you still need third party options to monitor and alert on lag.
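
For reference, per-partition consumer lag is conventionally the log end offset minus the committed offset. The Python below is a minimal sketch of that arithmetic only; it does not call any Kafka API. The input dictionaries, the function names, and the threshold are hypothetical, and the offsets would have to come from whichever third-party tool or client you use.

```python
def partition_lag(end_offsets, committed_offsets):
    """Compute lag per (topic, partition): end offset minus committed offset, floored at zero."""
    lag = {}
    for partition, end in end_offsets.items():
        # Assumption: a partition with no committed offset is treated as committed at 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(0, end - committed)
    return lag


def partitions_over_threshold(lag, threshold):
    """Return the partitions whose lag exceeds the alerting threshold, sorted."""
    return sorted(p for p, n in lag.items() if n > threshold)


if __name__ == "__main__":
    # Hypothetical offsets keyed by (topic, partition).
    end = {("orders", 0): 120, ("orders", 1): 80}
    committed = {("orders", 0): 100}
    lag = partition_lag(end, committed)
    print(partitions_over_threshold(lag, threshold=50))
```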
