Kubernetes API server is not starting on a single kubeadm cluster


Problem description

I'm trying to set up a bare-metal k8s cluster.

When creating the cluster with the flannel plugin (sudo kubeadm init --pod-network-cidr=10.244.0.0/16), it seems that the API server doesn't even run:

root@kubernetes-master:/# kubectl cluster-info
Kubernetes master is running at https://192.168.10.164:6443

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The connection to the server 192.168.10.164:6443 was refused - did you specify the right host or port?

I've disabled swap, and this is what I have in the logs:

Oct 09 11:45:50 kubernetes-master kubelet[12442]: E1009 11:45:50.975944   12442 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "kubernetes-master": Get https://192.168.10.164:6443/api/v1/nodes/kubernetes-master?resourceVersion=0&timeout=10s: dial tcp 192.168.10.164:6443: connect: connection refused
Oct 09 11:45:50 kubernetes-master kubelet[12442]: E1009 11:45:50.976715   12442 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "kubernetes-master": Get https://192.168.10.164:6443/api/v1/nodes/kubernetes-master?timeout=10s: dial tcp 192.168.10.164:6443: connect: connection refused
Oct 09 11:45:50 kubernetes-master kubelet[12442]: E1009 11:45:50.977162   12442 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "kubernetes-master": Get https://192.168.10.164:6443/api/v1/nodes/kubernetes-master?timeout=10s: dial tcp 192.168.10.164:6443: connect: connection refused
Oct 09 11:45:50 kubernetes-master kubelet[12442]: E1009 11:45:50.977741   12442 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "kubernetes-master": Get https://192.168.10.164:6443/api/v1/nodes/kubernetes-master?timeout=10s: dial tcp 192.168.10.164:6443: connect: connection refused
Oct 09 11:45:50 kubernetes-master kubelet[12442]: E1009 11:45:50.978199   12442 kubelet_node_status.go:391] Error updating node status, will retry: error getting node "kubernetes-master": Get https://192.168.10.164:6443/api/v1/nodes/kubernetes-master?timeout=10s: dial tcp 192.168.10.164:6443: connect: connection refused
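
"connection refused" in the kubelet log above usually means nothing is accepting connections on that address and port. A minimal sketch for confirming this on the master, assuming the ss tool from iproute2 is available; the helper name port_is_listening is made up for this illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds if any
# LISTEN socket has a local address ending in ":<port>".
port_is_listening() {
    awk -v suffix=":$1" '
        # Column 4 of `ss -ltn` is "Local Address:Port".
        $1 == "LISTEN" && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit !found }
    '
}

# Typical use on the master node:
#   ss -ltn | port_is_listening 6443 && echo "6443 is listening" || echo "nothing on 6443"
```

If nothing is listening on 6443, the problem is on the API server side rather than in the kubelet or kubectl configuration.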

When I run docker ps, I see that the API server did not even start:

root@kubernetes-master:/# docker ps -a
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS                      PORTS               NAMES
7904888d512d        ca1f38854f74           "kube-scheduler --ad…"   15 minutes ago      Up 15 minutes                                   k8s_kube-scheduler_kube-scheduler-kubernetes-master_kube-system_009228e74aef4d7babd7968782118d5e_1
ad5f25be44a3        ca1f38854f74           "kube-scheduler --ad…"   16 minutes ago      Exited (1) 16 minutes ago                       k8s_kube-scheduler_kube-scheduler-kubernetes-master_kube-system_009228e74aef4d7babd7968782118d5e_0
1948a59f8ec9        b8df3b177be2           "etcd --advertise-cl…"   16 minutes ago      Up 16 minutes                                   k8s_etcd_etcd-kubernetes-master_kube-system_2c12104e97be3063569dbbc535d06f35_0
a43f9cb2a143        k8s.gcr.io/pause:3.1   "/pause"                 16 minutes ago      Up 16 minutes                                   k8s_POD_kube-scheduler-kubernetes-master_kube-system_009228e74aef4d7babd7968782118d5e_0
c0125fd3aa06        k8s.gcr.io/pause:3.1   "/pause"                 16 minutes ago      Up 16 minutes                                   k8s_POD_etcd-kubernetes-master_kube-system_2c12104e97be3063569dbbc535d06f35_0
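
The listing above contains containers for etcd and kube-scheduler only. A rough sketch that makes the gap explicit by checking container names against the usual control-plane components; the function name missing_components is made up here, and the k8s_<container>_<pod>_ prefix is simply taken from the NAMES column above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads container names (one per line) on stdin and
# prints the control-plane components that have no matching container.
missing_components() {
    local names component missing=""
    names=$(cat)
    for component in etcd kube-apiserver kube-controller-manager kube-scheduler; do
        # Names look like k8s_<container>_<pod>_<namespace>_<uid>_<attempt>.
        if ! printf '%s\n' "$names" | grep -q "^k8s_${component}_"; then
            missing="${missing:+$missing }$component"
        fi
    done
    printf '%s\n' "$missing"
}

# Typical use on the master node (needs access to the Docker daemon):
#   docker ps -a --format '{{.Names}}' | missing_components
```

For the names shown above this should report kube-apiserver and kube-controller-manager. When a component has no container at all, the kubelet journal (journalctl -u kubelet) and the static pod manifests that kubeadm writes under /etc/kubernetes/manifests are reasonable places to look next.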

Of course, I'm also unable to configure the network plugin because the API server is down:

root@kubernetes-master:/# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": Get https://192.168.10.164:6443/api?timeout=32s: dial tcp 192.168.10.164:6443: connect: connection refused

I'm not sure how to continue debugging this; any assistance would be appreciated.

Recommended answer

Yes, you definitely have a problem with the API server. My advice is to wipe everything, update docker.io, kubelet, kubeadm, and kubectl to the latest versions, and start from scratch.

Let me walk you through it step by step:

Wipe your current cluster and update the packages as root:

# kubeadm reset -f && rm -rf /etc/kubernetes/
# apt-get update && apt-get install -y mc ebtables ethtool docker.io apt-transport-https curl
# curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
# apt-get update && apt-get install -y kubelet kubeadm kubectl

Make sure that the cgroup driver used by kubelet is the same as the one used by Docker. Verify that your Docker cgroup driver matches the kubelet config:

# docker info | grep -i cgroup
# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
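
The comparison can also be scripted instead of eyeballed. This is only a sketch: the function names are invented for the example, and it assumes the kubelet driver is set through a --cgroup-driver=<name> flag somewhere in the text you feed it. Depending on the kubeadm version that flag may live in 10-kubeadm.conf, in /var/lib/kubelet/kubeadm-flags.env, or not be set as a flag at all.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extracts the driver name from `docker info` output on stdin.
docker_cgroup_driver() {
    sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: extracts the value of a --cgroup-driver=<name> flag from
# kubelet configuration text on stdin; prints nothing if the flag is absent.
kubelet_cgroup_driver() {
    sed -n 's/.*--cgroup-driver=\([A-Za-z0-9_-]*\).*/\1/p' | head -n 1
}

# Succeeds only when both values are non-empty and identical.
drivers_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Typical use as root on the master node:
#   d=$(docker info 2>/dev/null | docker_cgroup_driver)
#   k=$(cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf \
#           /var/lib/kubelet/kubeadm-flags.env 2>/dev/null | kubelet_cgroup_driver)
#   drivers_match "$d" "$k" && echo "match: $d" || echo "mismatch: docker=$d kubelet=$k"
```

An empty kubelet value here only means the flag was not found in those files, not necessarily that the drivers differ.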

Check the versions:

root@kube-master-1:~# docker -v
Docker version 17.03.2-ce, build f5ec1e2
root@kube-master-1:~# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?
root@kube-master-1:~# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:43:08Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
root@kube-master-1:~# kubelet --version
Kubernetes v1.12.1

Start the cluster:

# kubeadm init --pod-network-cidr=10.244.0.0/16

Log in and run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
  source <(kubectl completion bash) # setup autocomplete in bash into the current shell, bash-completion package should be installed first.
  echo "source <(kubectl completion bash)" >> ~/.bashrc # add autocomplete permanently to your bash shell.

Check the cluster:

$ kubectl cluster-info
Kubernetes master is running at https://10.132.0.2:6443
KubeDNS is running at https://10.132.0.2:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

$ kubectl get no -o wide
NAME            STATUS     ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
kube-master-1   NotReady   master   4m26s   v1.12.1   10.132.0.2    <none>        Ubuntu 16.04.5 LTS   4.15.0-1021-gcp   docker://17.3.2

$ kubectl get all --all-namespaces 
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE
kube-system   pod/coredns-576cbf47c7-lw7jv                0/1     Pending   0          4m55s
kube-system   pod/coredns-576cbf47c7-ncx8w                0/1     Pending   0          4m55s
kube-system   pod/etcd-kube-master-1                      1/1     Running   0          4m23s
kube-system   pod/kube-apiserver-kube-master-1            1/1     Running   0          3m59s
kube-system   pod/kube-controller-manager-kube-master-1   1/1     Running   0          4m17s
kube-system   pod/kube-proxy-bwrwh                        1/1     Running   0          4m55s
kube-system   pod/kube-scheduler-kube-master-1            1/1     Running   0          4m10s

NAMESPACE     NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP         5m15s
kube-system   service/kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP   5m9s

NAMESPACE     NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-system   daemonset.apps/kube-proxy   1         1         1       1            1           <none>          5m8s

NAMESPACE     NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/coredns   2         2         2            0           5m9s

NAMESPACE     NAME                                 DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/coredns-576cbf47c7   2         2         0       4m56s
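
At this point the node is NotReady and the CoreDNS pods are Pending, which is what you would expect before a CNI plugin is installed. If you want to script that check, here is a small sketch; the helper name not_ready_nodes is made up for this example:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the names of nodes whose STATUS column does not start with "Ready".
not_ready_nodes() {
    awk '$2 !~ /^Ready/ { print $1 }'
}

# Typical use:
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reports Ready.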

Install a CNI plugin (I prefer Calico):

$ kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created


$ kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
configmap/calico-config created
service/calico-typha created
deployment.apps/calico-typha created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created

Check the result:

$ kubectl get no -o wide
NAME            STATUS   ROLES    AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION    CONTAINER-RUNTIME
kube-master-1   Ready    master   9m15s   v1.12.1   10.132.0.2    <none>        Ubuntu 16.04.5 LTS   4.15.0-1021-gcp   docker://17.3.2



$ kubectl get all --all-namespaces 
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE
kube-system   pod/calico-node-tsstf                       2/2     Running   0          2m3s
kube-system   pod/coredns-576cbf47c7-lw7jv                1/1     Running   0          9m20s
kube-system   pod/coredns-576cbf47c7-ncx8w                1/1     Running   0          9m20s
kube-system   pod/etcd-kube-master-1                      1/1     Running   0          8m48s
kube-system   pod/kube-apiserver-kube-master-1            1/1     Running   0          8m24s
kube-system   pod/kube-controller-manager-kube-master-1   1/1     Running   0          8m42s
kube-system   pod/kube-proxy-bwrwh                        1/1     Running   0          9m20s
kube-system   pod/kube-scheduler-kube-master-1            1/1     Running   0          8m35s

NAMESPACE     NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
default       service/kubernetes     ClusterIP   10.96.0.1       <none>        443/TCP         9m40s
kube-system   service/calico-typha   ClusterIP   10.105.62.183   <none>        5473/TCP        2m4s
kube-system   service/kube-dns       ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP   9m34s

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
kube-system   daemonset.apps/calico-node   1         1         1       1            1           beta.kubernetes.io/os=linux   2m4s
kube-system   daemonset.apps/kube-proxy    1         1         1       1            1           <none>                        9m33s

NAMESPACE     NAME                           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/calico-typha   0         0         0            0           2m4s
kube-system   deployment.apps/coredns        2         2         2            2           9m34s

NAMESPACE     NAME                                      DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/calico-typha-5f646c475c   0         0         0       2m4s
kube-system   replicaset.apps/coredns-576cbf47c7        2         2         2       9m21s

$ sudo docker ps -a | grep api
996cf65268fe        dcb029b5e3ad                                                                                  "kube-apiserver --..."   10 minutes ago      Up 10 minutes                           k8s_kube-apiserver_kube-apiserver-kube-master-1_kube-system_371bd9e2260dc98257ab7a6961e293b0_0
ab9f0949b295        k8s.gcr.io/pause:3.1                                                                          "/pause"                 10 minutes ago      Up 10 minutes                           k8s_POD_kube-apiserver-kube-master-1_kube-system_371bd9e2260dc98257ab7a6961e293b0_0

Hope this helps.
