星火工人无法连接到主 [英] Spark worker can not connect to Master

查看:378
本文介绍了星火工人无法连接到主的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

虽然在开始工作节点我得到以下错误:

While starting the worker node I get the following error :

Spark Command: /usr/lib/jvm/default-java/bin/java -cp /home/ubuntu/spark-1.5.1-bin-hadoop2.6/sbin/../conf/:/home/ubuntu/spark-1.5.1-bin-hadoop2.6/lib/spark-assembly-1.5.1-hadoop2.6.0.jar:/home/ubuntu/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/home/ubuntu/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/home/ubuntu/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar -Xms1g -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://ip-1-70-44-5:7077
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/10/16 19:19:10 INFO Worker: Registered signal handlers for [TERM, HUP, INT]
15/10/16 19:19:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/16 19:19:11 INFO SecurityManager: Changing view acls to: ubuntu
15/10/16 19:19:11 INFO SecurityManager: Changing modify acls to: ubuntu
15/10/16 19:19:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); users with modify permissions: Set(ubuntu)
15/10/16 19:19:12 INFO Slf4jLogger: Slf4jLogger started
15/10/16 19:19:12 INFO Remoting: Starting remoting
15/10/16 19:19:12 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker@1.70.44.4:55126]
15/10/16 19:19:12 INFO Utils: Successfully started service 'sparkWorker' on port 55126.
15/10/16 19:19:12 INFO Worker: Starting Spark worker 1.70.44.4:55126 with 2 cores, 2.9 GB RAM
15/10/16 19:19:12 INFO Worker: Running Spark version 1.5.1
15/10/16 19:19:12 INFO Worker: Spark home: /home/ubuntu/spark-1.5.1-bin-hadoop2.6
15/10/16 19:19:12 INFO Utils: Successfully started service 'WorkerUI' on port 8081.
15/10/16 19:19:12 INFO WorkerWebUI: Started WorkerWebUI at http://1.70.44.4:8081
15/10/16 19:19:12 INFO Worker: Connecting to master ip-1-70-44-5:7077...
15/10/16 19:19:24 INFO Worker: Retrying connection to master (attempt # 1)
15/10/16 19:19:24 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[sparkWorker-akka.actor.default-dispatcher-5,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1c5651e9 rejected from java.util.concurrent.ThreadPoolExecutor@671ba687[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 0]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
        at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1.apply(Worker.scala:211)
        at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1.apply(Worker.scala:210)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        at org.apache.spark.deploy.worker.Worker.org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters(Worker.scala:210)
        at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$reregisterWithMaster$1.apply$mcV$sp(Worker.scala:288)
        at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
        at org.apache.spark.deploy.worker.Worker.org$apache$spark$deploy$worker$Worker$$reregisterWithMaster(Worker.scala:234)
        at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:521)
        at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
        at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
        at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
        at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
        at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
        at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
        at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
        at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
        at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
        at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
        at akka.dispatch.Mailbox.run(Mailbox.scala:220)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
       at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/10/16 19:19:24 INFO ShutdownHookManager: Shutdown hook called

我添加主机名到的conf /奴隶文件。我不知道哪些环境变量spark-env.sh设置这样的权利不是它不被使用。

I have added the hostnames to the conf/slaves file. I dont know which enviroment variables to set in spark-env.sh so right not its not being used.

任何指针的解决方案?
另外,如果我要使用spark-env.sh那么这环境vvariables我应该跑?

Any pointers to the solution ? Also, if I should use spark-env.sh then which enviroment vvariables should I run ?

设置的详细信息:
2 ubuntu14机具有2个核,每个核。

setup details : 2 ubuntu14 machines having 2 cores each.

请指教。

感谢

推荐答案

于是,经过一番摆弄左右我发现,从无法给定的端口上与师父沟通。我改变了安全访问规则和启用了所有端口的TCP流量。这个问题解决了。

So, after some tinkering around I found that slave was not able to communicate with Master on the given port. I changed the security access rules and enabled all TCP traffic on all ports . This solved the problem.

要检查端口是开放的:

的telnet master.ip master.port

默认端口为7077。

我spark-env.sh:

My spark-env.sh :

出口SPARK_WORKER_INSTANCES = 2
出口SPARK_MASTER_IP =< IP地址>

这篇关于星火工人无法连接到主的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆