Node cannot join Swarm Cluster


Problem description

I have 3 VMs. They all have Docker 1.12 and run CentOS 7. All the ports are open and the VMs can ping each other. I started my cluster with:

docker swarm init --advertise-addr 192.168.140.12

docker info showed me:

Swarm: active
 NodeID: 0drcj2nku1mv8t16fxva48edxx
 Is Manager: true
 ClusterID: cchn0yzospwoe1h9f55d7omxx
 Managers: 1
 Nodes: 1

Now I try to join the other VMs to the cluster as nodes. I use the command that was recommended after starting my manager:

docker swarm join \
     --token SWMTKN-1-48ythur5k6ckkz90ttlprw37p9z3ldclws51qirw5wdyfmvevr-3sb2t66b2fj6e4dhmfo1vavxx \
     192.168.140.12:2377



But I got:

Error response from daemon: Timeout was reached before node was joined. Attempt to join the cluster will continue in the background. Use "docker info" command to see the current swarm status of your node.

docker info showed me:

Swarm: pending
 NodeID:
 Error: rpc error: code = 1 desc = context canceled
 Is Manager: false
 Node Address: 192.168.140.14

On the cluster manager:

# netstat -tulpn | grep docker
tcp6       0      0 :::2377                 :::*                    LISTEN      1602/dockerd
tcp6       0      0 :::7946                 :::*                    LISTEN      1602/dockerd
tcp6       0      0 :::8080                 :::*                    LISTEN      3398/docker-proxy
tcp6       0      0 :::32768                :::*                    LISTEN      3199/docker-proxy
tcp6       0      0 :::32769                :::*                    LISTEN      3219/docker-proxy
tcp6       0      0 :::32770                :::*                    LISTEN      3341/docker-proxy
tcp6       0      0 :::32771                :::*                    LISTEN      3436/docker-proxy
tcp6       0      0 :::2375                 :::*                    LISTEN      1602/dockerd
udp6       0      0 :::7946                 :::*                                1602/dockerd

How can I debug this issue, or did I forget to perform some important step? Do the servers need SSH access to each other? Thanks.

Logs on the node:

Aug  8 09:50:24 localhost dockerd: time="2016-08-08T09:50:24.393432145-04:00" level=error msg="Handler for POST /v1.24/swarm/leave returned error: This node is not part of swarm"
Aug  8 09:51:01 localhost su: (to root) worker1 on pts/1
Aug  8 09:51:34 localhost dockerd: time="2016-08-08T09:51:34.384408514-04:00" level=error msg="Handler for POST /v1.24/swarm/join returned error: Timeout was reached before node was joined. Attempt to join the cluster will continue in the background. Use \"docker info\" command to see the current swarm status of your node."
Aug  8 09:51:40 localhost su: (to root) worker1 on pts/1
Aug  8 09:52:47 localhost dhclient[1277]: DHCPREQUEST on eno16777736 to 192.168.140.254 port 67 (xid=0x11f8fba8)
Aug  8 09:52:47 localhost dhclient[1277]: DHCPACK from 192.168.140.254 (xid=0x11f8fba8)
Aug  8 09:52:47 localhost NetworkManager[953]: <info>    address 192.168.140.13
Aug  8 09:52:47 localhost NetworkManager[953]: <info>    plen 24 (255.255.255.0)
Aug  8 09:52:47 localhost NetworkManager[953]: <info>    gateway 192.168.140.2
Aug  8 09:52:47 localhost NetworkManager[953]: <info>    server identifier 192.168.140.254
Aug  8 09:52:47 localhost NetworkManager[953]: <info>    lease time 1800
Aug  8 09:52:47 localhost NetworkManager[953]: <info>    nameserver '192.168.140.2'
Aug  8 09:52:47 localhost NetworkManager[953]: <info>    domain name 'localdomain'
Aug  8 09:52:47 localhost NetworkManager[953]: <info>  (eno16777736): DHCPv4 state changed bound -> bound
Aug  8 09:52:47 localhost dbus[878]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug  8 09:52:47 localhost dbus-daemon: dbus[878]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Aug  8 09:52:47 localhost systemd: Starting Network Manager Script Dispatcher Service...
Aug  8 09:52:47 localhost dhclient[1277]: bound to 192.168.140.13 -- renewal in 713 seconds.
Aug  8 09:52:47 localhost dbus[878]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug  8 09:52:47 localhost dbus-daemon: dbus[878]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Aug  8 09:52:47 localhost nm-dispatcher: Dispatching action 'dhcp4-change' for eno16777736
Aug  8 09:52:47 localhost systemd: Started Network Manager Script Dispatcher Service.

Sometimes there is a warning:

level=warning msg="failed to retrieve remote root CA certificate: rpc error: code = 1 desc = context canceled


Recommended answer

Maybe you are using an HTTP proxy.
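One way to check whether a proxy is in play is to look for proxy variables in the environment. The following is only a sketch: the helper name filter_proxy_vars is made up for illustration, and the commented systemctl usage assumes the daemon runs as a systemd unit named "docker", which is typical on CentOS 7 but not confirmed by the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read NAME=value lines on stdin and keep only the
# proxy-related ones (HTTP_PROXY, HTTPS_PROXY, NO_PROXY, in any letter case).
filter_proxy_vars() {
    grep -iE '^(https?|no)_proxy='
}

# Usage (left commented out so that sourcing this file has no side effects):
#
# Proxy variables in the current shell:
#   env | filter_proxy_vars
#
# Proxy variables handed to the daemon by systemd
# (assumption: the unit is called "docker"):
#   systemctl show --property=Environment docker \
#       | sed 's/^Environment=//' | tr ' ' '\n' | filter_proxy_vars
```

If a proxy does show up for the daemon, a likely cause of the timeout is that the join request to 192.168.140.12:2377 is being sent through it. In that case, adding the manager address to NO_PROXY in the daemon's environment and restarting the daemon is worth trying.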

You can use the following command to see what dockerd is doing:

# strace -Fp `pidof dockerd` 2>&1 |grep -v futex |grep -v epoll_wait |grep -v pselect
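Before reading strace output, it may also help to confirm from the worker that the manager's cluster management port (2377/tcp, as shown in the netstat output above) accepts a direct TCP connection. This is a minimal sketch, assuming bash with /dev/tcp support and the coreutils timeout command; the helper name tcp_port_open is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 3 seconds, fail otherwise. Relies on bash's /dev/tcp
# pseudo-device, so the connection is made directly, not through a proxy.
tcp_port_open() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Usage on the worker node, with the manager address from the question
# (left commented out so that sourcing this file makes no connections):
#   tcp_port_open 192.168.140.12 2377 && echo "2377/tcp reachable" \
#       || echo "2377/tcp NOT reachable"
```

If the direct connection succeeds while docker swarm join still times out, that would be consistent with the daemon sending its traffic somewhere else, for example through a proxy. A failure here points instead at a firewall or routing problem between the VMs.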
