Can't expose Flink metrics to Prometheus

Problem description

I'm trying to expose the built-in metrics of Flink to Prometheus, but somehow Prometheus doesn't recognize the targets, neither the JMX reporter nor the PrometheusReporter.

The scraping defined in prometheus.yml looks like this:

scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']

  - job_name: 'kafka-server'
    static_configs:
      - targets: ['localhost:7071']

  - job_name: 'flink-jmx'
    static_configs:
      - targets: ['localhost:8789']

  - job_name: 'flink-prom'
    static_configs:
      - targets: ['localhost:9249']

My flink-conf.yml contains these lines:

#metrics.reporters: jmx, prom
metrics.reporters: jmx, prometheus

#metrics.reporter.jmx.factory.class: org.apache.flink.metrics.jmx.JMXReporterFactory
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789

metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249

However, the targets stay unrecognized no matter how I run WordCount:

  • in IntelliJ
  • as a jar: java -jar target/flink-word-count.jar --input src/main/resources/loremipsum.txt
  • as a Flink job: flink run target/flink-word-count.jar --input src/main/resources/loremipsum.txt

According to the Flink docs, JMX needs no additional dependencies, and the Prometheus reporter only needs the provided flink-metrics-prometheus-1.10.0.jar copied into flink/lib/.

What am I doing wrong? What is missing?

Answer

That particular job is going to run to completion pretty quickly, I believe. Once you get the setup working there may be no interesting metrics because the job doesn't run long enough for anything to show up.
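
(A sketch for illustration, not part of the original answer: a streaming WordCount with an unbounded source keeps running, so its metrics stay around long enough to be scraped. The generator source and class name below are made up for the example and assume Flink 1.10's DataStream API.)

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.util.Collector;

public class LongRunningWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Unbounded source: emits a line every 500 ms, so the job never finishes on its own.
        env.addSource(new SourceFunction<String>() {
                private volatile boolean running = true;

                @Override
                public void run(SourceContext<String> ctx) throws Exception {
                    while (running) {
                        ctx.collect("lorem ipsum dolor sit amet");
                        Thread.sleep(500);
                    }
                }

                @Override
                public void cancel() {
                    running = false;
                }
            })
            // Classic WordCount: split each line into words, count per word.
            .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                    for (String word : line.split("\\s+")) {
                        out.collect(Tuple2.of(word, 1));
                    }
                }
            })
            .keyBy(0)
            .sum(1)
            .print();

        env.execute("long-running-word-count");
    }
}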

When you run with a mini-cluster (as java -jar ...), the flink-conf.yaml file isn't loaded (unless you've done something rather special in your job to get it loaded). Note also that this file normally has a .yaml extension; I'm not sure whether it works if .yml is used instead.
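
(One way to do "something rather special": hand the reporter settings to the local environment programmatically instead of relying on flink-conf.yaml. A sketch only, with a placeholder class name, assuming Flink 1.10 and flink-metrics-prometheus on the job's classpath; the key strings mirror the config below.)

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalEnvWithPrometheus {
    public static void main(String[] args) throws Exception {
        // The mini-cluster ignores flink-conf.yaml, so pass the reporter settings in code.
        Configuration conf = new Configuration();
        conf.setString("metrics.reporters", "prom");
        conf.setString("metrics.reporter.prom.class",
                "org.apache.flink.metrics.prometheus.PrometheusReporter");
        conf.setString("metrics.reporter.prom.port", "9250-9260");

        // Local (mini-cluster) environment started with the configuration above.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1, conf);

        env.fromElements("lorem", "ipsum", "dolor").print();
        env.execute("mini-cluster-with-prometheus-reporter");
    }
}

With something like that in place, running the jar with java -jar ... should expose a scrape endpoint on one of the configured ports, provided the reporter jar is actually on the classpath.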

You can check the job manager and task manager logs to make sure that the reporters are being loaded.

FWIW, the last time I did this I used this setup, so that I could scrape from multiple processes:

# flink-conf.yaml

metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9250-9260

# prometheus.yml

scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['localhost:9250', 'localhost:9251']
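
(Side note on that setup, as far as I understand the reporter: with a port range such as 9250-9260, each Flink process on the host, the job manager and every task manager, binds the first free port from the range, which is why the scrape config lists more than one target.)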
