Transport Endpoint Not Connected - Mesos Slave / Master


Problem Description

I'm trying to connect a Mesos slave to its master. Whenever the slave tries to connect to the master, I get the following message:

I0806 16:39:59.090845   935 hierarchical.hpp:528] Added slave 20150806-163941-1027506442-5050-921-S3 (debian) with cpus(*):1; mem(*):1938; disk(*):3777; ports(*):[31000-32000] (allocated: )
E0806 16:39:59.091384   940 socket.hpp:107] Shutdown failed on fd=25: Transport endpoint is not connected [107]
I0806 16:39:59.091508   940 master.cpp:3395] Registered slave 20150806-163941-1027506442-5050-921-S3 at slave(1)@127.0.1.1:5051 (debian) with cpus(*):1; mem(*):1938; disk(*):3777; ports(*):[31000-32000]
I0806 16:39:59.091747   940 master.cpp:1006] Slave 20150806-163941-1027506442-5050-921-S3 at slave(1)@127.0.1.1:5051 (debian) disconnected
I0806 16:39:59.091868   940 master.cpp:2203] Disconnecting slave 20150806-163941-1027506442-5050-921-S3 at slave(1)@127.0.1.1:5051 (debian)
I0806 16:39:59.092031   940 master.cpp:2222] Deactivating slave 20150806-163941-1027506442-5050-921-S3 at slave(1)@127.0.1.1:5051 (debian)
I0806 16:39:59.092248   939 hierarchical.hpp:621] Slave 20150806-163941-1027506442-5050-921-S3 deactivated

The error seems to be:

E0806 16:39:59.091384 940 socket.hpp:107] Shutdown failed on fd=25: Transport endpoint is not connected [107]

The master was started with:

./mesos-master.sh --ip=10.129.62.61 --work_dir=~/Mesos/mesos-0.23.0/workdir/ --zk=zk://10.129.62.61:2181/mesos --quorum=1

and the slave with:

./mesos-slave.sh --master=zk://10.129.62.61:2181/mesos
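Before digging into Mesos itself, it can help to confirm basic reachability between the two VMs. A minimal sketch, assuming the single-host `zk://` URL from the commands above (`nc` is netcat; the port numbers are the defaults used in this question):

```shell
# Extract the ZooKeeper host/port from the --zk URL used above.
# This parsing assumes a single-host URL of the form zk://host:port/path.
ZK_URL="zk://10.129.62.61:2181/mesos"
hostport=${ZK_URL#zk://}      # strip the scheme  -> 10.129.62.61:2181/mesos
hostport=${hostport%%/*}      # drop the chroot   -> 10.129.62.61:2181
host=${hostport%%:*}
port=${hostport##*:}
echo "$host $port"

# From the slave VM, probe the relevant ports:
#   nc -zv "$host" "$port"    # ZooKeeper (2181)
#   nc -zv "$host" 5050       # Mesos master
```

If either probe fails, the problem is network-level (firewall, bridging) rather than a Mesos misconfiguration.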

If I run the slave on the same VM as the master, it works fine.

I couldn't find much information on the internet. I'm running two VirtualBox 5 VMs (Debian 8.1). The host is Windows 7.

Edit 1:

The master and the slave each run on a dedicated VM.

Both VMs' networks are configured using bridged networking.

ifconfig from master:

eth0      Link encap:Ethernet  HWaddr 08:00:27:cc:6c:6e
          inet addr:10.129.62.61  Bcast:10.129.255.255  Mask:255.255.0.0
          inet6 addr: fe80::a00:27ff:fecc:6c6e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5335953 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1422428 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:595886271 (568.2 MiB)  TX bytes:362423868 (345.6 MiB)

ifconfig from slave:

eth0      Link encap:Ethernet  HWaddr 08:00:27:56:83:20
          inet addr:10.129.62.49  Bcast:10.129.255.255  Mask:255.255.0.0
          inet6 addr: fe80::a00:27ff:fe56:8320/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4358561 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3825 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:397126834 (378.7 MiB)  TX bytes:354116 (345.8 KiB)

Edit 2:

The slave log can be found at http://pastebin.com/CXZUBHKr

The master log can be found at http://pastebin.com/thYR1par

Recommended Answer

I had a similar problem. My slave logs would be filled with

    E0812 15:58:04.017990  2193 socket.hpp:107] Shutdown failed on fd=13: Transport endpoint is not connected [107]

and my master would show

    F0120 20:45:48.025610 12116 master.cpp:1083] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins

And the master would die, a new election would occur, and the killed master would be restarted by upstart (I am on a CentOS 6 box) and added back into the pool of potential masters. Thus my elected master would daisy-chain around my master nodes. Many restarts of masters and slaves did nothing; the problem would consistently return within 1 minute of a master election.

The solution for me came from a StackOverflow question (thanks) and a hint in a GitHub gist note.

The gist of it is that /etc/default/mesos-master must specify a quorum number (it needs to be correct for the number of Mesos masters; in my case, 3 masters):

    MESOS_QUORUM=2
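The quorum must be a strict majority of the master count, which is why 2 is the right value for 3 masters. A minimal sketch of that arithmetic (the floor(N/2) + 1 rule is the standard majority requirement for replicated logs; the variable names are illustrative):

```shell
# Quorum for a replicated-log cluster: a strict majority of N masters.
masters=3                        # number of Mesos masters in this setup
quorum=$(( masters / 2 + 1 ))    # floor(N/2) + 1  ->  2 for 3 masters
echo "MESOS_QUORUM=$quorum"
```

Note that a quorum larger than the actual number of reachable masters can never be satisfied, which is one way the registrar fetch above can time out.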

This seems odd to me, as I have the same information in the file /etc/mesos-master/quorum.

But I added it to /etc/default/mesos-master, restarted the mesos-masters and slaves, and the problem has not returned.

I hope this helps.
