无法连接到Kubernetes的ETCD [英] Can't connect to the ETCD of Kubernetes

查看:127
本文介绍了无法连接到Kubernetes的ETCD的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

无意中耗尽/取消了Kubernetes中的所有节点(甚至是主节点),现在我试图通过连接到ETCD并手动更改其中的一些密钥来将其恢复。我成功地撞到了etcd容器中:

I've accidentally drained/uncordoned all nodes in Kubernetes (even master) and now I'm trying to bring it back by connecting to the ETCD and manually change some keys in there. I successfuly bashed into etcd container:

$ docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS               NAMES
8fbcb67da963        quay.io/coreos/etcd:v3.3.10                "/usr/local/bin/etcd"    17 hours ago        Up 17 hours                             etcd1
a0d6426df02a        cd48205a40f0                               "kube-controller-man…"   17 hours ago        Up 17 hours                             k8s_kube-controller-manager_kube-controller-manager-node1_kube-system_0441d7804a7366fd957f8b402008efe5_16
5fa8e47441a0        6bed756ced73                               "kube-scheduler --au…"   17 hours ago        Up 17 hours                             k8s_kube-scheduler_kube-scheduler-node1_kube-system_6f33d7866b72ca1b13c79edd42fa8dc6_14
2c8e07cf499f        gcr.io/google_containers/pause-amd64:3.1   "/pause"                 17 hours ago        Up 17 hours                             k8s_POD_kube-scheduler-node1_kube-system_6f33d7866b72ca1b13c79edd42fa8dc6_3
2ca43282ea1c        gcr.io/google_containers/pause-amd64:3.1   "/pause"                 17 hours ago        Up 17 hours                             k8s_POD_kube-controller-manager-node1_kube-system_0441d7804a7366fd957f8b402008efe5_3
9473644a3333        gcr.io/google_containers/pause-amd64:3.1   "/pause"                 17 hours ago        Up 17 hours                             k8s_POD_kube-apiserver-node1_kube-system_93ff1a9840f77f8b2b924a85815e17fe_3

然后我运行:

docker exec -it 8fbcb67da963 /bin/sh

然后我尝试运行以下命令:

and then I try to run the following:

ETCDCTL_API=3 etcdctl --endpoints https://172.16.16.111:2379 --cacert /etc/ssl/etcd/ssl/ca.pem --key /etc/ssl/etcd/ssl/member-node1-key.pem --cert /etc/ssl/etcd/ssl/member-node1.pem get / --prefix=true -w json --debug

这是我得到的结果:

ETCDCTL_CACERT=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_CERT=/etc/ssl/etcd/ssl/member-node1.pem
ETCDCTL_COMMAND_TIMEOUT=5s
ETCDCTL_DEBUG=true
ETCDCTL_DIAL_TIMEOUT=2s
ETCDCTL_DISCOVERY_SRV=
ETCDCTL_ENDPOINTS=[https://172.16.16.111:2379]
ETCDCTL_HEX=false
ETCDCTL_INSECURE_DISCOVERY=true
ETCDCTL_INSECURE_SKIP_TLS_VERIFY=false
ETCDCTL_INSECURE_TRANSPORT=true
ETCDCTL_KEEPALIVE_TIME=2s
ETCDCTL_KEEPALIVE_TIMEOUT=6s
ETCDCTL_KEY=/etc/ssl/etcd/ssl/member-node1-key.pem
ETCDCTL_USER=
ETCDCTL_WRITE_OUT=json
INFO: 2020/06/24 15:44:07 ccBalancerWrapper: updating state and picker called by balancer: IDLE, 0xc420246c00
INFO: 2020/06/24 15:44:07 dialing to target with scheme: ""
INFO: 2020/06/24 15:44:07 could not get resolver for scheme: ""
INFO: 2020/06/24 15:44:07 balancerWrapper: is pickfirst: false
INFO: 2020/06/24 15:44:07 balancerWrapper: got update addr from Notify: [{172.16.16.111:2379 <nil>}]
INFO: 2020/06/24 15:44:07 ccBalancerWrapper: new subconn: [{172.16.16.111:2379 0  <nil>}]
INFO: 2020/06/24 15:44:07 balancerWrapper: handle subconn state change: 0xc4201708d0, CONNECTING
INFO: 2020/06/24 15:44:07 ccBalancerWrapper: updating state and picker called by balancer: CONNECTING, 0xc420246c00
Error: context deadline exceeded

这是我的etcd.env:

Here is my etcd.env:

# Environment file for etcd v3.3.10
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://172.16.16.111:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://172.16.16.111:2380
ETCD_INITIAL_CLUSTER_STATE=existing
ETCD_METRICS=basic
ETCD_LISTEN_CLIENT_URLS=https://172.16.16.111:2379,https://127.0.0.1:2379
ETCD_ELECTION_TIMEOUT=5000
ETCD_HEARTBEAT_INTERVAL=250
ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd
ETCD_LISTEN_PEER_URLS=https://172.16.16.111:2380
ETCD_NAME=etcd1
ETCD_PROXY=off
ETCD_INITIAL_CLUSTER=etcd1=https://172.16.16.111:2380,etcd2=https://172.16.16.112:2380,etcd3=https://172.16.16.113:2380
ETCD_AUTO_COMPACTION_RETENTION=8
ETCD_SNAPSHOT_COUNT=10000

# TLS settings
ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_CERT_FILE=/etc/ssl/etcd/ssl/member-node1.pem
ETCD_KEY_FILE=/etc/ssl/etcd/ssl/member-node1-key.pem
ETCD_CLIENT_CERT_AUTH=true

ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/member-node1.pem
ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/member-node1-key.pem
ETCD_PEER_CLIENT_CERT_AUTH=True

更新1:

这是我的kubeadm-config.yaml:

Here is my kubeadm-config.yaml:

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.16.111
  bindPort: 6443
certificateKey: d73faece88f86e447eea3ca38f7b07e0a1f0bbb886567fee3b8cf8848b1bf8dd
nodeRegistration:
  name: node1
  taints: []
  criSocket: /var/run/dockershim.sock
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: cluster.local
etcd:
  external:
      endpoints:
      - https://172.16.16.111:2379
      - https://172.16.16.112:2379
      - https://172.16.16.113:2379
      caFile: /etc/ssl/etcd/ssl/ca.pem
      certFile: /etc/ssl/etcd/ssl/node-node1.pem
      keyFile: /etc/ssl/etcd/ssl/node-node1-key.pem
dns:
  type: CoreDNS
  imageRepository: docker.io/coredns
  imageTag: 1.6.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.233.0.0/18
  podSubnet: 10.233.64.0/18
kubernetesVersion: v1.16.6
controlPlaneEndpoint: 172.16.16.111:6443
certificatesDir: /etc/kubernetes/ssl
imageRepository: gcr.io/google-containers
apiServer:
  extraArgs:
    anonymous-auth: "True"
    authorization-mode: Node,RBAC
    bind-address: 0.0.0.0
    insecure-port: "0"
    apiserver-count: "1"
    endpoint-reconciler-type: lease
    service-node-port-range: 30000-32767
    kubelet-preferred-address-types: "InternalDNS,InternalIP,Hostname,ExternalDNS,ExternalIP"
    profiling: "False"
    request-timeout: "1m0s"
    enable-aggregator-routing: "False"
    storage-backend: etcd3
    runtime-config: 
    allow-privileged: "true"
  extraVolumes:
  - name: usr-share-ca-certificates
    hostPath: /usr/share/ca-certificates
    mountPath: /usr/share/ca-certificates
    readOnly: true
  certSANs:
  - kubernetes
  - kubernetes.default
  - kubernetes.default.svc
  - kubernetes.default.svc.cluster.local
  - 10.233.0.1
  - localhost
  - 127.0.0.1
  - node1
  - lb-apiserver.kubernetes.local
  - 172.16.16.111
  - node1.cluster.local
  timeoutForControlPlane: 5m0s
controllerManager:
  extraArgs:
    node-monitor-grace-period: 40s
    node-monitor-period: 5s
    pod-eviction-timeout: 5m0s
    node-cidr-mask-size: "24"
    profiling: "False"
    terminated-pod-gc-threshold: "12500"
    bind-address: 0.0.0.0
    configure-cloud-routes: "false"
scheduler:
  extraArgs:
    bind-address: 0.0.0.0
  extraVolumes:
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
bindAddress: 0.0.0.0
clientConnection:
 acceptContentTypes: 
 burst: 10
 contentType: application/vnd.kubernetes.protobuf
 kubeconfig: 
 qps: 5
clusterCIDR: 10.233.64.0/18
configSyncPeriod: 15m0s
conntrack:
 maxPerCore: 32768
 min: 131072
 tcpCloseWaitTimeout: 1h0m0s
 tcpEstablishedTimeout: 24h0m0s
enableProfiling: False
healthzBindAddress: 0.0.0.0:10256
hostnameOverride: node1
iptables:
 masqueradeAll: False
 masqueradeBit: 14
 minSyncPeriod: 0s
 syncPeriod: 30s
ipvs:
 excludeCIDRs: []
 minSyncPeriod: 0s
 scheduler: rr
 syncPeriod: 30s
 strictARP: False
metricsBindAddress: 127.0.0.1:10249
mode: ipvs
nodePortAddresses: []
oomScoreAdj: -999
portRange: 
udpIdleTimeout: 250ms
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDNS:
- 169.254.25.10

更新2:

/etc/kubernetes/manigests/kube-apiserver.yaml:

Contents of /etc/kubernetes/manigests/kube-apiserver.yaml:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=172.16.16.111
    - --allow-privileged=true
    - --anonymous-auth=True
    - --apiserver-count=1
    - --authorization-mode=Node,RBAC
    - --bind-address=0.0.0.0
    - --client-ca-file=/etc/kubernetes/ssl/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-aggregator-routing=False
    - --enable-bootstrap-token-auth=true
    - --endpoint-reconciler-type=lease
    - --etcd-cafile=/etc/ssl/etcd/ssl/ca.pem
    - --etcd-certfile=/etc/ssl/etcd/ssl/node-node1.pem
    - --etcd-keyfile=/etc/ssl/etcd/ssl/node-node1-key.pem
    - --etcd-servers=https://172.16.16.111:2379,https://172.16.16.112:2379,https://172.16.16.113:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/ssl/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/ssl/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalDNS,InternalIP,Hostname,ExternalDNS,ExternalIP
    - --profiling=False
    - --proxy-client-cert-file=/etc/kubernetes/ssl/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/ssl/front-proxy-client.key
    - --request-timeout=1m0s
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/ssl/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --runtime-config=
    - --secure-port=6443
    - --service-account-key-file=/etc/kubernetes/ssl/sa.pub
    - --service-cluster-ip-range=10.233.0.0/18
    - --service-node-port-range=30000-32767
    - --storage-backend=etcd3
    - --tls-cert-file=/etc/kubernetes/ssl/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/ssl/apiserver.key
    image: gcr.io/google-containers/kube-apiserver:v1.16.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 172.16.16.111
        path: /healthz
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-apiserver
    resources:
      requests:
        cpu: 250m
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /etc/ssl/etcd/ssl
      name: etcd-certs-0
      readOnly: true
    - mountPath: /etc/kubernetes/ssl
      name: k8s-certs
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /etc/ssl/etcd/ssl
      type: DirectoryOrCreate
    name: etcd-certs-0
  - hostPath:
      path: /etc/kubernetes/ssl
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: ""
    name: usr-share-ca-certificates
status: {}

我使用kubespray来安装群集。

I used kubespray to install the cluster.

如何连接到etcd?任何帮助将不胜感激。

How can I connect to the etcd? Any help would be appreciated.

推荐答案

通常会超过此上下文截止日期 c由于


  1. 使用错误的证书。您可能正在使用对等证书而不是客户端证书。您需要检查Kubernetes API Server参数,该参数会告诉您客户端证书在哪里,因为Kubernetes API Server是ETCD的客户端。然后,您可以在节点的 etcdctl 命令中使用这些相同的证书。

  1. Using wrong certificates. You could be using peer certificates instead of client certificates. You need to check the Kubernetes API Server parameters which will tell you where are the client certificates located because Kubernetes API Server is a client to ETCD. Then you can use those same certificates in the etcdctl command from the node.

etcd群集不再运行,因为对等成员已关闭。

The etcd cluster is not operational anymore because peer members are down.

这篇关于无法连接到Kubernetes的ETCD的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆