Prometheus DNS service discovery in Docker Swarm: relabeling the instance label


Question

My question is a follow-up to "Prometheus DNS service discovery in Docker Swarm".

I define the Prometheus scrape targets as follows:

- job_name: 'node-exporter'
  dns_sd_configs:
  - names:
    - 'tasks.nodeexporter'
    type: 'A'
    port: 9100

This works fine, but it results in Prometheus using the IP address of each Docker container as the instance label.

I tried to relabel the instance label as follows:

relabel_configs:
- source_labels: [__meta_dns_name]
  target_label: instance

But doing so results in all instances of node-exporter having the same instance label, "tasks.nodeexporter".

Is it somehow possible to relabel the instance label to something like tasks.nodeexporter_1, tasks.nodeexporter_2, ...?

Recommended answer

Service discovery for Docker Swarm setups isn't well supported in Prometheus, because many features are missing on the Swarm side.

DNS service discovery is one way to mitigate those missing features, but in my opinion it isn't a good solution, and I recommend not using it in production:


  • there is no way to provide additional information, e.g. by using SRV records
  • there is no information about how many instances should be running
  • because DNS lists only the healthy tasks, the number of scrape targets decreases when a task is no longer considered healthy, which makes it harder to alert on misbehaving containers
  • when containers die and are restarted, you will observe new instances, because no information such as the task slot is available

Altogether, these issues prevent this approach from being a reliable source for a monitoring system.

If you are really tied to using Docker Swarm, you should consider building a more sustainable solution by querying the Docker API programmatically and using Prometheus's file_sd service discovery mechanism. For reference, see this proof of concept by Container Solutions: https://github.com/ContainerSolutions/prometheus-swarm-discovery
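
To illustrate what the file_sd side of such a setup could look like, here is a minimal sketch. It is not taken from the linked proof of concept. The file path, the refresh interval, the IP addresses and the slot-based instance names are all assumptions made up for this example. It also assumes that some external script (not shown) queries the Docker API and rewrites the target file whenever tasks change.

```yaml
# prometheus.yml (fragment): read scrape targets from a file instead of DNS.
# The path below is a hypothetical location; the file must end in .json, .yml or .yaml.
scrape_configs:
  - job_name: 'node-exporter'
    file_sd_configs:
      - files:
          - '/etc/prometheus/file_sd/nodeexporter.yml'
        refresh_interval: 30s   # assumed value; changes to the file are also picked up when it is rewritten
```

The target file itself could then carry one entry per task, with a stable instance label derived from the task slot rather than from the container IP:

```yaml
# /etc/prometheus/file_sd/nodeexporter.yml (hypothetical content, generated by the external script)
# The addresses and instance names are illustrative only.
- targets:
    - '10.0.0.5:9100'
  labels:
    instance: 'nodeexporter.1'
- targets:
    - '10.0.0.6:9100'
  labels:
    instance: 'nodeexporter.2'
```

Because the instance label is already set by the target file, Prometheus does not fall back to the address for it, so a restarted container that takes over the same slot would keep the same instance name as long as the generating script assigns it that way.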
