Prometheus: cannot export metrics from connected Kubernetes cluster


Question

The problem: I have a Prometheus server outside of the Kubernetes cluster, and I want to export metrics from that remote cluster.

I took the config sample from the Prometheus GitHub repo and modified it a little. So, here is my working config:

  - job_name: 'kubernetes-apiservers'

    scheme: http

    kubernetes_sd_configs:
    - role: endpoints
      api_server: http://cluster-manager.dev.example.net:8080

    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true

    relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;http

  - job_name: 'kubernetes-nodes'

    scheme: http

    kubernetes_sd_configs:
    - role: node
      api_server: http://cluster-manager.dev.example.net:8080

    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true

    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)

  - job_name: 'kubernetes-service-endpoints'

    scheme: http

    kubernetes_sd_configs:
    - role: endpoints
      api_server: http://cluster-manager.dev.example.net:8080

    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true

    relabel_configs:
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
      action: replace
      target_label: __scheme__
      regex: (https?)
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: (.+)(?::\d+);(\d+)
      replacement: $1:$2
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: kubernetes_name

  - job_name: 'kubernetes-services'

    scheme: http

    metrics_path: /probe
    params:
      module: [http_2xx]

    kubernetes_sd_configs:
    - role: service
      api_server: http://cluster-manager.dev.example.net:8080

    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true

    relabel_configs:
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
      action: keep
      regex: true
    - source_labels: [__address__]
      target_label: __param_target
    - target_label: __address__
      replacement: blackbox
    - source_labels: [__param_target]
      target_label: instance
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_service_namespace]
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_service_name]
      target_label: kubernetes_name

  - job_name: 'kubernetes-pods'

    scheme: http

    kubernetes_sd_configs:
    - role: pod
      api_server: http://cluster-manager.dev.example.net:8080

    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true

    relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: (.+):(?:\d+);(\d+)
      replacement: ${1}:${2}
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: kubernetes_pod_name
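As a sanity check, the `__address__` rewrite rule above (whose regex needs escaped digit classes, i.e. `\d+`, to match anything) can be simulated outside Prometheus: the source labels are joined with `;`, the regex is fully anchored, and the replacement substitutes the captured groups. A minimal Python sketch, with a made-up pod address and a hypothetical `prometheus.io/port` annotation value:

```python
import re

# Prometheus joins source_labels with ';' and anchors the regex (^...$).
REGEX = re.compile(r"^(.+):(?:\d+);(\d+)$")

def relabel_address(address: str, annotation_port: str) -> str:
    """Mimic the __address__ rewrite: swap the discovered container port
    for the port given in the prometheus.io/port annotation."""
    joined = f"{address};{annotation_port}"
    m = REGEX.match(joined)
    if not m:
        return address  # no match -> the label is left unchanged
    return f"{m.group(1)}:{m.group(2)}"

# Hypothetical pod discovered on port 10054, annotated to be scraped on 9102:
print(relabel_address("10.32.0.2:10054", "9102"))  # -> 10.32.0.2:9102
```

If the backslashes are dropped (as in a literal `(?:d+)`), the regex never matches and the `keep`/`replace` rules silently do nothing, which is worth ruling out first.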

I don't use a TLS connection to the API, so I want to disable it.

When I curl the /metrics URL from the Prometheus host, it prints them.

Eventually I connect to the cluster, but... the jobs are not up, and therefore Prometheus doesn't expose the relabeled metrics.

Here is what I see in the console.

Target status: (screenshot of the Targets page omitted)

I also checked the Prometheus debug output. It looks like the system gets all the necessary information and the requests complete successfully.

time="2017-01-25T06:58:04Z" level=debug msg="pod update" kubernetes_sd=pod source="pod.go:66" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{"__meta_kubernetes_pod_container_port_protocol":"UDP", "__address__":"10.32.0.2:10053", "__meta_kubernetes_pod_container_name":"kube-dns", "__meta_kubernetes_pod_container_port_number":"10053", "__meta_kubernetes_pod_container_port_name":"dns-local"}, model.LabelSet{"__address__":"10.32.0.2:10053", "__meta_kubernetes_pod_container_name":"kube-dns", "__meta_kubernetes_pod_container_port_number":"10053", "__meta_kubernetes_pod_container_port_name":"dns-tcp-local", "__meta_kubernetes_pod_container_port_protocol":"TCP"}, model.LabelSet{"__meta_kubernetes_pod_container_name":"kube-dns", "__meta_kubernetes_pod_container_port_number":"10055", "__meta_kubernetes_pod_container_port_name":"metrics", "__meta_kubernetes_pod_container_port_protocol":"TCP", "__address__":"10.32.0.2:10055"}, model.LabelSet{"__address__":"10.32.0.2:53", "__meta_kubernetes_pod_container_name":"dnsmasq", "__meta_kubernetes_pod_container_port_number":"53", "__meta_kubernetes_pod_container_port_name":"dns", "__meta_kubernetes_pod_container_port_protocol":"UDP"}, model.LabelSet{"__address__":"10.32.0.2:53", "__meta_kubernetes_pod_container_name":"dnsmasq", "__meta_kubernetes_pod_container_port_number":"53", "__meta_kubernetes_pod_container_port_name":"dns-tcp", "__meta_kubernetes_pod_container_port_protocol":"TCP"}, model.LabelSet{"__meta_kubernetes_pod_container_port_number":"10054", "__meta_kubernetes_pod_container_port_name":"metrics", "__meta_kubernetes_pod_container_port_protocol":"TCP", "__address__":"10.32.0.2:10054", "__meta_kubernetes_pod_container_name":"dnsmasq-metrics"}, model.LabelSet{"__meta_kubernetes_pod_container_port_protocol":"TCP", "__address__":"10.32.0.2:8080", "__meta_kubernetes_pod_container_name":"healthz", "__meta_kubernetes_pod_container_port_number":"8080", "__meta_kubernetes_pod_container_port_name":""}}, 
Labels:model.LabelSet{"__meta_kubernetes_pod_ready":"true", "__meta_kubernetes_pod_annotation_kubernetes_io_created_by":"{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"kube-system\",\"name\":\"kube-dns-2924299975\",\"uid\":\"fa808d95-d7d9-11e6-9ac9-02dfdae1a1e9\",\"apiVersion\":\"extensions\",\"resourceVersion\":\"89\"}}\n", "__meta_kubernetes_pod_annotation_scheduler_alpha_kubernetes_io_affinity":"{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"beta.kubernetes.io/arch\",\"operator\":\"In\",\"values\":[\"amd64\"]}]}]}}}", "__meta_kubernetes_pod_name":"kube-dns-2924299975-dksg5", "__meta_kubernetes_pod_ip":"10.32.0.2", "__meta_kubernetes_pod_label_k8s_app":"kube-dns", "__meta_kubernetes_pod_label_pod_template_hash":"2924299975", "__meta_kubernetes_pod_label_tier":"node", "__meta_kubernetes_pod_annotation_scheduler_alpha_kubernetes_io_tolerations":"[{\"key\":\"dedicated\",\"value\":\"master\",\"effect\":\"NoSchedule\"}]", "__meta_kubernetes_namespace":"kube-system", "__meta_kubernetes_pod_node_name":"cluster-manager.dev.example.net", "__meta_kubernetes_pod_label_component":"kube-dns", "__meta_kubernetes_pod_label_kubernetes_io_cluster_service":"true", "__meta_kubernetes_pod_host_ip":"54.194.166.39", "__meta_kubernetes_pod_label_name":"kube-dns"}, Source:"pod/kube-system/kube-dns-2924299975-dksg5"}" 
time="2017-01-25T06:58:04Z" level=debug msg="pod update" kubernetes_sd=pod source="pod.go:66" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{"__address__":"10.43.0.0", "__meta_kubernetes_pod_container_name":"bot"}}, Labels:model.LabelSet{"__meta_kubernetes_pod_host_ip":"172.17.101.25", "__meta_kubernetes_pod_label_app":"bot", "__meta_kubernetes_namespace":"default", "__meta_kubernetes_pod_name":"bot-272181271-pnzsz", "__meta_kubernetes_pod_ip":"10.43.0.0", "__meta_kubernetes_pod_node_name":"ip-172-17-101-25", "__meta_kubernetes_pod_annotation_kubernetes_io_created_by":"{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"default\",\"name\":\"bot-272181271\",\"uid\":\"c297b3c2-e15d-11e6-a28a-02dfdae1a1e9\",\"apiVersion\":\"extensions\",\"resourceVersion\":\"1465127\"}}\n", "__meta_kubernetes_pod_ready":"true", "__meta_kubernetes_pod_label_pod_template_hash":"272181271", "__meta_kubernetes_pod_label_version":"v0.1"}, Source:"pod/default/bot-272181271-pnzsz"}" 

Prometheus fetches the updates, but... doesn't convert them into metrics. I've been racking my brain trying to figure out why this happens. So, please help if you can spot where the mistake may be.

Answer

If you want to monitor a Kubernetes cluster from an external Prometheus server, I suggest setting up a Prometheus federation topology:

  • Inside K8s, install node-exporter pods and a Prometheus instance with short-term storage.
  • Expose the Prometheus service out of the K8s cluster, via an ingress controller (LB) or a node port. You can secure this endpoint with HTTPS + basic authentication.
  • Configure the central Prometheus to scrape metrics from the above endpoint with the proper authentication and labels.

This is a scalable solution. You can add as many monitored K8s clusters as you want, until the central Prometheus reaches its capacity. Then you can add another central Prometheus instance to monitor the others.
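A minimal sketch of the federation scrape job on the central server, using Prometheus's `/federate` endpoint. The job name, target hostname, credentials path, and `match[]` selector below are illustrative assumptions, not from the question:

```yaml
  - job_name: 'federate-dev-cluster'
    honor_labels: true           # keep the labels attached by the in-cluster Prometheus
    metrics_path: /federate
    params:
      'match[]':
        - '{job=~"kubernetes-.*"}'   # pull all in-cluster kubernetes-* job series
    scheme: https
    basic_auth:
      username: prometheus
      password_file: /opt/prometheus/federate_password   # hypothetical path
    static_configs:
      - targets:
        - 'prometheus.dev.example.net:443'   # the exposed in-cluster Prometheus
```

With this layout the central server never needs to reach the Kubernetes API at all; only the in-cluster Prometheus does the service discovery.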

