SPARK + Standalone Cluster: Cannot start worker from another machine


Problem description


I've set up a Spark standalone cluster following this link. I have 2 machines; the first one (ubuntu0) serves as both the master and a worker, and the second one (ubuntu1) is just a worker. Password-less SSH has already been properly configured on both machines and was tested by SSHing manually in both directions.
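For reference, password-less SSH between two hosts is usually set up along these lines (a sketch, assuming OpenSSH; the hostnames are the ones from the question):

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa   # on ubuntu0; skip if a key already exists
ssh-copy-id ubuntu1                        # push the public key to ubuntu1
ssh ubuntu1 hostname                       # should print "ubuntu1" with no password prompt

The same steps are repeated from ubuntu1 toward ubuntu0 so that the cluster scripts can reach either machine.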


Now, when I tried ./start-all.sh, both the master and the worker on the master machine (ubuntu0) started properly. This is signified by (1) the WebUI being accessible (localhost:8081 on my part) and (2) the worker being registered/displayed in the WebUI. However, the worker on the second machine (ubuntu1) was not started. The error displayed was:

ubuntu1: ssh: connect to host ubuntu1 port 22: Connection timed out
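A timeout on port 22 points at the network path or the SSH daemon rather than at the key setup itself; one way to narrow it down (a sketch, assuming standard Ubuntu tools) is:

ping -c 3 ubuntu1          # is the host reachable at all?
getent hosts ubuntu1       # does the name resolve to the expected address?
nc -zv ubuntu1 22          # is anything listening on port 22?

sudo service ssh status    # on ubuntu1: is sshd actually running?
sudo ufw status            # on ubuntu1: is a firewall dropping the connection?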


This is quite weird, given that I've properly configured password-less SSH on both sides. So I accessed the second machine and tried to start the worker manually with these commands:

./spark-class org.apache.spark.deploy.worker.Worker spark://ubuntu0:7707
./spark-class org.apache.spark.deploy.worker.Worker spark://<ip>:7707


However, below is the result:

14/05/23 13:49:08 INFO Utils: Using Spark's default log4j profile:    
                              org/apache/spark/log4j-defaults.properties
14/05/23 13:49:08 WARN Utils: Your hostname, ubuntu1 resolves to a loopback address:    
                        127.0.1.1; using 192.168.122.1 instead (on interface virbr0)
14/05/23 13:49:08 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
14/05/23 13:49:09 INFO Slf4jLogger: Slf4jLogger started
14/05/23 13:49:09 INFO Remoting: Starting remoting
14/05/23 13:49:09 INFO Remoting: Remoting started; listening on addresses :  
                                 [akka.tcp://sparkWorker@ubuntu1.local:42739]
14/05/23 13:49:09 INFO Worker: Starting Spark worker ubuntu1.local:42739 with 8 cores,  
                               4.8 GB RAM
14/05/23 13:49:09 INFO Worker: Spark home: /home/ubuntu1/jaysonp/spark/spark-0.9.1
14/05/23 13:49:09 INFO WorkerWebUI: Started Worker web UI at http://ubuntu1.local:8081
14/05/23 13:49:09 INFO Worker: Connecting to master spark://ubuntu0:7707...
14/05/23 13:49:29 INFO Worker: Connecting to master spark://ubuntu0:7707...
14/05/23 13:49:49 INFO Worker: Connecting to master spark://ubuntu0:7707...
14/05/23 13:50:09 ERROR Worker: All masters are unresponsive! Giving up.
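The WARN lines above show the worker binding to 192.168.122.1 on virbr0, a libvirt NAT bridge, because the hostname ubuntu1 resolves to the loopback address 127.0.1.1. Pinning the bind address, as the log itself suggests, is one possible workaround (a sketch; 192.168.3.223 is a hypothetical LAN address for ubuntu1):

# conf/spark-env.sh on ubuntu1
SPARK_LOCAL_IP=192.168.3.223   # substitute ubuntu1's real LAN address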


Below are the contents of my master and slave/worker spark-env.sh:

SPARK_MASTER_IP=192.168.3.222
STANDALONE_SPARK_MASTER_HOST=`hostname -f`


How should I resolve this? Thanks in advance!

Recommended answer


For those who are still encountering errors when starting workers on different machines, I just want to share that using IP addresses in conf/slaves worked for me. Hope this helps!
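As a concrete sketch, conf/slaves on the master would then list one worker address per line (192.168.3.222 is the master's address from the question; 192.168.3.223 is a hypothetical address for ubuntu1):

# conf/slaves on ubuntu0 -- one worker per line, by IP instead of hostname
192.168.3.222
192.168.3.223

Listing raw IPs sidesteps the loopback hostname resolution that the worker log complained about.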
