Can't expose Flink metrics to Prometheus
Question
I'm trying to expose the built-in metrics of Flink to Prometheus, but somehow Prometheus doesn't recognize the targets - both the JMX as well as the PrometheusReporter.
The scraping defined in prometheus.yml looks like this:
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'kafka-server'
    static_configs:
      - targets: ['localhost:7071']
  - job_name: 'flink-jmx'
    static_configs:
      - targets: ['localhost:8789']
  - job_name: 'flink-prom'
    static_configs:
      - targets: ['localhost:9249']
My flink-conf.yml has the following lines:
#metrics.reporters: jmx, prom
metrics.reporters: jmx, prometheus
#metrics.reporter.jmx.factory.class: org.apache.flink.metrics.jmx.JMXReporterFactory
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249
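A likely culprit in the config above: Flink only starts reporters whose name in metrics.reporters matches the metrics.reporter.&lt;name&gt;.* key prefix, so the name prometheus never matches the prom.* keys and that reporter is never instantiated. A consistent version (keeping the same ports) would be:

```
metrics.reporters: jmx, prom
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249
```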
However, Prometheus still doesn't recognize the targets when running WordCount
- in IntelliJ
- as a jar:
java -jar target/flink-word-count.jar --input src/main/resources/loremipsum.txt
- as a Flink job:
flink run target/flink-word-count.jar --input src/main/resources/loremipsum.txt
According to the Flink docs, I don't need any additional dependencies for JMX, and for the Prometheus reporter a copy of the provided flink-metrics-prometheus-1.10.0.jar in flink/lib/ should suffice.
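For reference, that copy step usually amounts to the following, run from the Flink distribution root (the path assumes the Flink 1.10.0 layout, where the reporter jar ships in opt/):

```shell
# Puts the Prometheus reporter on the cluster classpath
# (filename assumes Flink 1.10.0)
cp opt/flink-metrics-prometheus-1.10.0.jar lib/
```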
What am I doing wrong? What is missing?
Answer
That particular job is going to run to completion pretty quickly, I believe. Once you get the setup working there may be no interesting metrics because the job doesn't run long enough for anything to show up.
When you run with a mini-cluster (as java -jar ...), the flink-conf.yaml file isn't loaded (unless you've done something rather special in your job to get it loaded). Note also that this file normally has a .yaml extension; I'm not sure whether it works if .yml is used instead.
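If you do want the reporter active in a mini-cluster run, one option is to pass the settings programmatically instead of relying on flink-conf.yaml. A minimal sketch, assuming Flink 1.10 APIs (the class name LocalWithMetrics is made up, and flink-metrics-prometheus must be on the application classpath):

```
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalWithMetrics {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // Same reporter settings as flink-conf.yaml, passed in code so the
        // embedded mini-cluster actually sees them
        config.setString("metrics.reporter.prom.class",
                "org.apache.flink.metrics.prometheus.PrometheusReporter");
        config.setString("metrics.reporter.prom.port", "9249");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1, config);
        // ... build and execute the WordCount pipeline on env ...
    }
}
```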
You can check the job manager and task manager logs to make sure that the reporters are being loaded.
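On a standalone setup, a quick way to do that check is to grep the logs for the reporter class names (the path assumes the default log/ directory of the distribution):

```shell
# A reporter that started successfully is mentioned by class name at startup
grep -i -e PrometheusReporter -e JMXReporter log/flink-*.log
```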
FWIW, the last time I did this I used this setup, so that I could scrape from multiple processes:
# flink-conf.yaml
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9250-9260
# prometheus.yml
scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['localhost:9250', 'localhost:9251']
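With a port range like 9250-9260, each Flink process (job manager, each task manager) binds the first free port in the range, so it can be useful to probe which ports actually answer before pointing Prometheus at them. A small sketch, assuming curl is available and the cluster runs on localhost:

```shell
probe_flink_metrics() {
  for port in $(seq "$1" "$2"); do
    # curl -sf exits non-zero (and prints nothing) when no endpoint answers
    curl -sf --max-time 1 "http://localhost:$port/metrics" >/dev/null \
      && echo "metrics served on port $port"
  done
}

probe_flink_metrics 9250 9260
```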