Kubernetes without pod metrics


Problem description


I'm trying to deploy the metrics server to Kubernetes, and something really strange is happening. I have one worker and one master, with the following pod list:

NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE     IP               NODE                      NOMINATED NODE   READINESS GATES
default       php-apache-774ff9d754-d7vp9                       1/1     Running   0          2m43s   192.168.77.172   master-node               <none>           <none>
kube-system   calico-kube-controllers-6b9d4c8765-x7pql          1/1     Running   2          4h11m   192.168.77.130   master-node               <none>           <none>
kube-system   calico-node-d4rnh                                 0/1     Running   1          4h11m   10.221.194.166   master-node               <none>           <none>
kube-system   calico-node-hwkmd                                 0/1     Running   1          4h11m   10.221.195.58    free5gc-virtual-machine   <none>           <none>
kube-system   coredns-6955765f44-kf4dr                          1/1     Running   1          4h20m   192.168.178.65   free5gc-virtual-machine   <none>           <none>
kube-system   coredns-6955765f44-s58rf                          1/1     Running   1          4h20m   192.168.178.66   free5gc-virtual-machine   <none>           <none>
kube-system   etcd-free5gc-virtual-machine                      1/1     Running   1          4h21m   10.221.195.58    free5gc-virtual-machine   <none>           <none>
kube-system   kube-apiserver-free5gc-virtual-machine            1/1     Running   1          4h21m   10.221.195.58    free5gc-virtual-machine   <none>           <none>
kube-system   kube-controller-manager-free5gc-virtual-machine   1/1     Running   1          4h21m   10.221.195.58    free5gc-virtual-machine   <none>           <none>
kube-system   kube-proxy-brvdg                                  1/1     Running   1          4h19m   10.221.194.166   master-node               <none>           <none>
kube-system   kube-proxy-lfzjw                                  1/1     Running   1          4h20m   10.221.195.58    free5gc-virtual-machine   <none>           <none>
kube-system   kube-scheduler-free5gc-virtual-machine            1/1     Running   1          4h21m   10.221.195.58    free5gc-virtual-machine   <none>           <none>
kube-system   metrics-server-86c6d8b9bf-p2hh8                   1/1     Running   0          2m43s   192.168.77.171   master-node               <none>           <none>


When I try to get the metrics I see the following:

NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   <unknown>/50%   1         10        1          3m58s
free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl top pods --all-namespaces
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)


Lastly, here is the log output (v=6) of metrics-server:

free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl logs metrics-server-86c6d8b9bf-p2hh8  -n kube-system
I0206 18:16:18.657605       1 serving.go:273] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0206 18:16:19.367356       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 7 milliseconds
I0206 18:16:19.370573       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 1 milliseconds
I0206 18:16:19.373245       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 1 milliseconds
I0206 18:16:19.375024       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 1 milliseconds
[restful] 2020/02/06 18:16:19 log.go:33: [restful/swagger] listing is available at https://:4443/swaggerapi
[restful] 2020/02/06 18:16:19 log.go:33: [restful/swagger] https://:4443/swaggerui/ is mapped to folder /swagger-ui/
I0206 18:16:19.421207       1 healthz.go:83] Installing healthz checkers:"ping", "poststarthook/generic-apiserver-start-informers", "healthz"
I0206 18:16:19.421641       1 serve.go:96] Serving securely on [::]:4443
I0206 18:16:19.421873       1 reflector.go:202] Starting reflector *v1.Pod (0s) from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.421891       1 reflector.go:240] Listing and watching *v1.Pod from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.421914       1 reflector.go:202] Starting reflector *v1.Node (0s) from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.421929       1 reflector.go:240] Listing and watching *v1.Node from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.423052       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0 200 OK in 1 milliseconds
I0206 18:16:19.424261       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0 200 OK in 2 milliseconds
I0206 18:16:19.425586       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/nodes?resourceVersion=38924&timeoutSeconds=481&watch=true 200 OK in 0 milliseconds
I0206 18:16:19.433545       1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/pods?resourceVersion=39246&timeoutSeconds=582&watch=true 200 OK in 0 milliseconds
I0206 18:16:49.388514       1 manager.go:99] Beginning cycle, collecting metrics...
I0206 18:16:49.388598       1 manager.go:95] Scraping metrics from 2 sources
I0206 18:16:49.395742       1 manager.go:120] Querying source: kubelet_summary:free5gc-virtual-machine
I0206 18:16:49.400574       1 manager.go:120] Querying source: kubelet_summary:master-node
I0206 18:16:49.413751       1 round_trippers.go:405] GET https://10.221.194.166:10250/stats/summary/ 200 OK in 13 milliseconds
I0206 18:16:49.414317       1 round_trippers.go:405] GET https://10.221.195.58:10250/stats/summary/ 200 OK in 18 milliseconds
I0206 18:16:49.417044       1 manager.go:150] ScrapeMetrics: time: 28.428677ms, nodes: 2, pods: 13
I0206 18:16:49.417062       1 manager.go:115] ...Storing metrics...
I0206 18:16:49.417083       1 manager.go:126] ...Cycle complete


Using the log output with v=10 I can even see the health details of each pod, but I get nothing when running kubectl get hpa or kubectl top nodes. Can someone give me a hint? Furthermore, my metrics-server manifest is:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        args:
          - /metrics-server
          - --metric-resolution=30s
          - --requestheader-allowed-names=aggregator
          - --cert-dir=/tmp
          - --secure-port=4443
          - --kubelet-insecure-tls
          - --v=6
          - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
            #- --kubelet-preferred-address-types=InternalIP
        ports:
        - name: main-port
          containerPort: 4443
          protocol: TCP
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        imagePullPolicy: Always
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        beta.kubernetes.io/os: linux
        kubernetes.io/arch: "amd64"

And I can see the following:

free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: "2020-02-06T18:57:28Z"
  name: v1beta1.metrics.k8s.io
  resourceVersion: "45583"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  uid: ca439221-b987-4c13-b0e0-8d2bb237e612
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: "2020-02-06T18:57:28Z"
    message: 'failing or missing response from https://10.110.144.114:443/apis/metrics.k8s.io/v1beta1:
      Get https://10.110.144.114:443/apis/metrics.k8s.io/v1beta1: dial tcp 10.110.144.114:443:
      connect: no route to host'
    reason: FailedDiscoveryCheck
    status: "False"
    type: Available

Answer


I have reproduced your issue (on Google Compute Engine) and tried a few scenarios to find a workaround or solution.


The first thing I want to mention is that you have provided only the ServiceAccount and Deployment YAML. You also need a ClusterRoleBinding, a RoleBinding, an APIService, etc. All the required YAMLs can be found in the metrics-server GitHub repo (https://github.com/kubernetes-sigs/metrics-server).


To quickly deploy metrics-server with all the required config, you can use:

$ git clone https://github.com/kubernetes-sigs/metrics-server.git
$ cd metrics-server/deploy/
$ kubectl apply -f kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
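
A quick way to verify the rollout is to check the pod and the aggregated API registration (illustrative commands; the label k8s-app=metrics-server comes from the Deployment above):

$ kubectl -n kube-system get pods -l k8s-app=metrics-server
$ kubectl get apiservice v1beta1.metrics.k8s.io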


The second thing I would advise is to check your CNI pods (calico-node-d4rnh and calico-node-hwkmd): they were created 4h11m ago but are still Ready 0/1.
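
To see why the calico-node pods never become Ready, describe one of them and check its container logs (illustrative commands; the container name calico-node is the usual one in the Calico manifests):

$ kubectl -n kube-system describe pod calico-node-d4rnh
$ kubectl -n kube-system logs calico-node-d4rnh -c calico-node --tail=20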


The last thing concerns gathering CPU and memory data from pods and nodes.

Using Calico


If you are using a single-node kubeadm cluster, it will work correctly; however, with more than one node in kubeadm this causes some issues. There are many similar threads on GitHub about it. I tried various flags in args:, but with no success. In the metrics-server logs (-v=6) you can see that metrics are being gathered. In one of those GitHub threads, a user posted the following workaround for this issue; hostNetwork is also mentioned in the K8s docs.


Adding hostNetwork: true is what finally got metrics-server working for me. Without it, nada. Without the kubelet-preferred-address-types line, I could query my master node but not my two worker nodes, nor could I query pods, which is obviously undesirable. The lack of kubelet-insecure-tls also results in an inoperable metrics-server installation.

spec:
  hostNetwork: true
  containers:
  - args:
    - --kubelet-insecure-tls
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP
    - --v=6
    image: k8s.gcr.io/metrics-server-amd64:v0.3.6
    imagePullPolicy: Always


If you deploy with this config, it will work.
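
Alternatively, if metrics-server is already deployed, you can toggle the same setting with a strategic-merge patch instead of editing the manifest (a sketch, assuming the default Deployment name in kube-system):

$ kubectl -n kube-system patch deployment metrics-server \
    --patch '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'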

$ kubectl describe apiservice v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
...
Status:
  Conditions:
    Last Transition Time:  2020-02-20T09:37:59Z
    Message:               all checks passed
    Reason:                Passed
    Status:                True
    Type:                  Available
Events:                    <none>


In addition, you can see the difference hostNetwork: true makes when you check iptables: there are many more entries compared to a deployment without this setting.


After that, you can edit the deployment and remove or comment out hostNetwork: true.

$ kubectl edit deploy metrics-server -n kube-system
deployment.apps/metrics-server edited

$ kubectl top pods
NAME                     CPU(cores)   MEMORY(bytes)   
nginx-6db489d4b7-2qhzw   0m           3Mi             
nginx-6db489d4b7-9fvrj   0m           2Mi             
nginx-6db489d4b7-dgbf9   0m           2Mi             
nginx-6db489d4b7-dvcz5   0m           2Mi   


Also, you will be able to fetch the raw metrics using:

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes


For better readability you can also use jq:

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | jq .
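
For example, to pull out just the pod names and the first container's usage (an illustrative filter; the field names follow the metrics.k8s.io PodMetrics schema):

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | \
    jq '.items[] | {pod: .metadata.name, cpu: .containers[0].usage.cpu, memory: .containers[0].usage.memory}'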

Using Weave Net


If you use Weave Net instead of Calico, it will work without setting hostNetwork.

$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"


However, you will then need to work with certificates. If you don't care about security, you can just use --kubelet-insecure-tls as in the previous example with Calico.
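
For reference, one documented way to avoid --kubelet-insecure-tls on a kubeadm cluster (a sketch, not part of the original answer) is to let each kubelet request a serving certificate signed by the cluster CA and then approve the resulting CSRs:

# KubeletConfiguration fragment, e.g. in /var/lib/kubelet/config.yaml
# on every node (restart the kubelet afterwards)
serverTLSBootstrap: true

# each kubelet then files a CSR that must be approved manually;
# <csr-name> is a placeholder for the name shown by the first command
$ kubectl get csr
$ kubectl certificate approve <csr-name>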
