Kafka Connect Alerting Options?


Question

Are there any alerting options for cases where a Kafka Connect connector or connector task fails or runs into errors?

We have Kafka Connect running, and it runs well, but we have had errors that had to be traced and discovered manually. Often a connector has been in an error state for a week before anyone notices the problem.

Recommended answer

Since this post was originally written and answered, Kafka Connect has started providing its own official metrics. Apache Kafka Connect exposes them in the traditional JMX format.

If you use the Confluent Kafka Connect Helm Charts (https://github.com/confluentinc/cp-helm-charts/tree/master/charts/cp-kafka-connect), they include a Prometheus metrics exporter.

I monitor and alert on cp_kafka_connect_connect_connector_metrics{status="running"}, which is exported by the Prometheus exporter in the Confluent Helm chart, but there are many possible variations on that.
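As a rough illustration of alerting on that metric, here is a minimal Python sketch that runs an instant query against the Prometheus HTTP API and reports which expected connectors have no series with status="running". The metric name comes from the answer above. The `connector` label name, the list of expected connectors, and the reading that a present series means the connector is running are assumptions, so check them against what your exporter actually emits.

```python
import json
import urllib.parse
import urllib.request

# Metric name taken from the answer; everything else here is illustrative.
QUERY = 'cp_kafka_connect_connect_connector_metrics{status="running"}'


def connectors_not_running(prom_response, expected, label="connector"):
    """Return expected connector names that have no status="running" series.

    `label` is an assumed label name; adjust it to match your exporter.
    """
    if prom_response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    running = {
        series["metric"].get(label)
        for series in prom_response["data"]["result"]
    }
    return sorted(set(expected) - running)


def fetch(prometheus_url, query=QUERY, timeout=10):
    """Run an instant query via the Prometheus HTTP API (/api/v1/query)."""
    params = urllib.parse.urlencode({"query": query})
    url = prometheus_url.rstrip("/") + "/api/v1/query?" + params
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A scheduled job could call `connectors_not_running(fetch("http://prometheus:9090"), ["my-sink"])` (both the URL and the connector name are hypothetical) and notify someone when the list is non-empty. In practice a Prometheus alerting rule on the same expression is probably the more common choice.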

Using the official Kafka Connect metrics is generally preferable for any automated monitoring and alerting setup. That option was not available when this post was originally written and answered.

FYI, Kafka still doesn't expose lag metrics, so you still need a third-party option to monitor and alert on lag.
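For context, lag tools generally report, per partition, the difference between the log end offset and the consumer group's committed offset. The sketch below shows only that arithmetic. The function name and the input dictionaries are hypothetical, and how the offsets are collected is left to whichever tool or client you use.

```python
def partition_lag(end_offsets, committed_offsets):
    """Compute lag per partition as end offset minus committed offset.

    Both arguments map a (topic, partition) tuple to an integer offset.
    Partitions with no committed offset are reported as None (unknown).
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition)
        if committed is None:
            lag[partition] = None
        else:
            # Clamp at zero in case the two offsets were sampled at different times.
            lag[partition] = max(end - committed, 0)
    return lag
```

An alert would then fire when any partition's lag stays above a threshold you choose for longer than you are willing to tolerate.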
