Prometheus: cannot export metrics from connected Kubernetes cluster
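Question
Problem: I have a Prometheus server outside the Kubernetes cluster, and I want to export metrics from the remote cluster.
I took the sample config from the Prometheus GitHub repo and modified it slightly. Here is my job configuration: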
- job_name: 'kubernetes-apiservers'
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    api_server: http://cluster-manager.dev.example.net:8080
    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;http
- job_name: 'kubernetes-nodes'
  scheme: http
  kubernetes_sd_configs:
  - role: node
    api_server: http://cluster-manager.dev.example.net:8080
    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
- job_name: 'kubernetes-service-endpoints'
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    api_server: http://cluster-manager.dev.example.net:8080
    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (http?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: (.+)(?::\d+);(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
- job_name: 'kubernetes-services'
  scheme: http
  metrics_path: /probe
  params:
    module: [http_2xx]
  kubernetes_sd_configs:
  - role: service
    api_server: http://cluster-manager.dev.example.net:8080
    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
    action: keep
    regex: true
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_service_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name
- job_name: 'kubernetes-pods'
  scheme: http
  kubernetes_sd_configs:
  - role: pod
    api_server: http://cluster-manager.dev.example.net:8080
    bearer_token_file: /opt/prometheus/prometheus/kube_tokens/dev
    tls_config:
      insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: (.+):(?:\d+);(\d+)
    replacement: ${1}:${2}
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
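I don't use a TLS connection to the API, so I want to disable it.
When I curl the /metrics URL from the Prometheus host, it prints the metrics.
Eventually I connected to the cluster, but the jobs are not up, so Prometheus does not expose the relabeled metrics.
What I see in the console:
Target status:
I also checked the Prometheus debug log. It appears that the system receives all the necessary information and that the requests complete successfully.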
time="2017-01-25T06:58:04Z" level=debug msg="pod update" kubernetes_sd=pod source="pod.go:66" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{"__meta_kubernetes_pod_container_port_protocol":"UDP", "__address__":"10.32.0.2:10053", "__meta_kubernetes_pod_container_name":"kube-dns", "__meta_kubernetes_pod_container_port_number":"10053", "__meta_kubernetes_pod_container_port_name":"dns-local"}, model.LabelSet{"__address__":"10.32.0.2:10053", "__meta_kubernetes_pod_container_name":"kube-dns", "__meta_kubernetes_pod_container_port_number":"10053", "__meta_kubernetes_pod_container_port_name":"dns-tcp-local", "__meta_kubernetes_pod_container_port_protocol":"TCP"}, model.LabelSet{"__meta_kubernetes_pod_container_name":"kube-dns", "__meta_kubernetes_pod_container_port_number":"10055", "__meta_kubernetes_pod_container_port_name":"metrics", "__meta_kubernetes_pod_container_port_protocol":"TCP", "__address__":"10.32.0.2:10055"}, model.LabelSet{"__address__":"10.32.0.2:53", "__meta_kubernetes_pod_container_name":"dnsmasq", "__meta_kubernetes_pod_container_port_number":"53", "__meta_kubernetes_pod_container_port_name":"dns", "__meta_kubernetes_pod_container_port_protocol":"UDP"}, model.LabelSet{"__address__":"10.32.0.2:53", "__meta_kubernetes_pod_container_name":"dnsmasq", "__meta_kubernetes_pod_container_port_number":"53", "__meta_kubernetes_pod_container_port_name":"dns-tcp", "__meta_kubernetes_pod_container_port_protocol":"TCP"}, model.LabelSet{"__meta_kubernetes_pod_container_port_number":"10054", "__meta_kubernetes_pod_container_port_name":"metrics", "__meta_kubernetes_pod_container_port_protocol":"TCP", "__address__":"10.32.0.2:10054", "__meta_kubernetes_pod_container_name":"dnsmasq-metrics"}, model.LabelSet{"__meta_kubernetes_pod_container_port_protocol":"TCP", "__address__":"10.32.0.2:8080", "__meta_kubernetes_pod_container_name":"healthz", "__meta_kubernetes_pod_container_port_number":"8080", "__meta_kubernetes_pod_container_port_name":""}}, 
Labels:model.LabelSet{"__meta_kubernetes_pod_ready":"true", "__meta_kubernetes_pod_annotation_kubernetes_io_created_by":"{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"kube-system\",\"name\":\"kube-dns-2924299975\",\"uid\":\"fa808d95-d7d9-11e6-9ac9-02dfdae1a1e9\",\"apiVersion\":\"extensions\",\"resourceVersion\":\"89\"}}\n", "__meta_kubernetes_pod_annotation_scheduler_alpha_kubernetes_io_affinity":"{\"nodeAffinity\":{\"requiredDuringSchedulingIgnoredDuringExecution\":{\"nodeSelectorTerms\":[{\"matchExpressions\":[{\"key\":\"beta.kubernetes.io/arch\",\"operator\":\"In\",\"values\":[\"amd64\"]}]}]}}}", "__meta_kubernetes_pod_name":"kube-dns-2924299975-dksg5", "__meta_kubernetes_pod_ip":"10.32.0.2", "__meta_kubernetes_pod_label_k8s_app":"kube-dns", "__meta_kubernetes_pod_label_pod_template_hash":"2924299975", "__meta_kubernetes_pod_label_tier":"node", "__meta_kubernetes_pod_annotation_scheduler_alpha_kubernetes_io_tolerations":"[{\"key\":\"dedicated\",\"value\":\"master\",\"effect\":\"NoSchedule\"}]", "__meta_kubernetes_namespace":"kube-system", "__meta_kubernetes_pod_node_name":"cluster-manager.dev.example.net", "__meta_kubernetes_pod_label_component":"kube-dns", "__meta_kubernetes_pod_label_kubernetes_io_cluster_service":"true", "__meta_kubernetes_pod_host_ip":"54.194.166.39", "__meta_kubernetes_pod_label_name":"kube-dns"}, Source:"pod/kube-system/kube-dns-2924299975-dksg5"}"
time="2017-01-25T06:58:04Z" level=debug msg="pod update" kubernetes_sd=pod source="pod.go:66" tg="&config.TargetGroup{Targets:[]model.LabelSet{model.LabelSet{"__address__":"10.43.0.0", "__meta_kubernetes_pod_container_name":"bot"}}, Labels:model.LabelSet{"__meta_kubernetes_pod_host_ip":"172.17.101.25", "__meta_kubernetes_pod_label_app":"bot", "__meta_kubernetes_namespace":"default", "__meta_kubernetes_pod_name":"bot-272181271-pnzsz", "__meta_kubernetes_pod_ip":"10.43.0.0", "__meta_kubernetes_pod_node_name":"ip-172-17-101-25", "__meta_kubernetes_pod_annotation_kubernetes_io_created_by":"{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"default\",\"name\":\"bot-272181271\",\"uid\":\"c297b3c2-e15d-11e6-a28a-02dfdae1a1e9\",\"apiVersion\":\"extensions\",\"resourceVersion\":\"1465127\"}}\n", "__meta_kubernetes_pod_ready":"true", "__meta_kubernetes_pod_label_pod_template_hash":"272181271", "__meta_kubernetes_pod_label_version":"v0.1"}, Source:"pod/default/bot-272181271-pnzsz"}"
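Prometheus receives the updates but does not turn them into metrics. I have racked my brain trying to figure out why it behaves this way, so please help if you can spot where the mistake might be.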
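Recommended answer
If you want to monitor a Kubernetes cluster from an external Prometheus server, I suggest setting up a Prometheus federation topology:
- Inside the K8s cluster, install node-exporter pods and a Prometheus instance with short-term storage.
- Expose the Prometheus service from the K8s cluster, either through an ingress controller (LB) or a node port. You can protect this endpoint with HTTPS + basic authentication.
- Configure the central Prometheus to scrape metrics from the above endpoint with the proper authentication and labels.
This solution is scalable. You can add as many monitored K8s clusters as you like until the central Prometheus reaches its capacity. Then you can add another central Prometheus instance to monitor the remaining clusters.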
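As a rough illustration of the second step, the in-cluster Prometheus could be exposed with a NodePort Service along the following lines. This is only a sketch: the Service name, the `monitoring` namespace, the `app: prometheus` selector, and the node port number are all assumptions and must be adapted to the actual deployment.

```yaml
# Hypothetical Service exposing an in-cluster Prometheus on every node.
apiVersion: v1
kind: Service
metadata:
  name: prometheus          # assumed name
  namespace: monitoring     # assumed namespace
spec:
  type: NodePort
  selector:
    app: prometheus         # assumed label; must match the Prometheus pod labels
  ports:
    - name: web
      port: 9090            # Prometheus default listen port
      targetPort: 9090
      nodePort: 30090       # assumed; must lie in the cluster's NodePort range (30000-32767 by default)
```

A plain NodePort serves unencrypted HTTP, so the HTTPS + basic authentication mentioned above would have to come from something placed in front of it, such as an ingress controller or a reverse proxy.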
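On the central Prometheus, the federation scrape might look roughly like the sketch below. The host names, the `cluster` label values, the basic-auth user, the password file path, and the `match[]` selector are assumptions chosen for illustration; only the `/federate` endpoint, `honor_labels`, and the `match[]` parameter come from Prometheus federation itself.

```yaml
scrape_configs:
  - job_name: 'federate-k8s'
    scrape_interval: 30s
    honor_labels: true            # keep the job/instance labels set by the in-cluster Prometheus
    metrics_path: /federate
    scheme: https
    params:
      'match[]':
        - '{job=~"kubernetes-.*"}'   # assumed selector; pulls only the kubernetes-* jobs
    basic_auth:
      username: federation                                       # assumed user
      password_file: /etc/prometheus/secrets/federation-password  # assumed path
    static_configs:
      - targets: ['prometheus.dev.example.net:443']    # assumed endpoint of cluster 1
        labels:
          cluster: dev
      - targets: ['prometheus.prod.example.net:443']   # assumed endpoint of cluster 2
        labels:
          cluster: prod
```

Monitoring another cluster then amounts to adding one more entry under `static_configs`, with its own `cluster` label so that series from different clusters stay distinguishable.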