Can't connect pyspark to master


Problem description


I have set up a three-node Spark cluster that is also used as a Hadoop cluster.
master/worker1 is also namenode/datanode1
worker2 is also datanode2
worker3 is also datanode3


The nodes are VMs with private IP addresses, but I also created a static IP address for each of them.


private IP: 192.168.0.4 - static IP: x.x.x.117
private IP: 192.168.0.7 - static IP: x.x.x.118
private IP: 192.168.0.2 - static IP: x.x.x.120


Hadoop version is hadoop-2.6.3
Spark version is spark-1.5.2-bin-hadoop2.6
Java version is 1.7.0_79

When I used the command line:
$ MASTER=spark://x.x.x.117:7077 pyspark --master yarn-client

it does not give any errors, and after all the verbose messages it displays on screen, I eventually get the pyspark prompt and can run Spark jobs. It's just that everything runs only locally.
Also, when I check the Spark WebUI at http://x.x.x.117:8080, the pyspark application does not show up under the "Running Applications" section of the page. I suspect that the pyspark shell is not really running in cluster mode.
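
One quick way to confirm that suspicion (this check is not in the original post, but sc is the SparkContext that the pyspark shell creates automatically) is to print the master URL the shell actually attached to:

>>> sc.master   # the master URL the running shell is connected to
u'local[*]'     # a value like this, instead of yarn-client or a spark:// URL, would confirm it is running locally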

So I tried the following command:
$ MASTER=spark://x.x.x.117:7077 pyspark

The above command gives these messages on the console:


Python 2.7.5 (default, Jun 24 2015, 00:41:19)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
16/01/06 20:14:39 INFO spark.SparkContext: Running Spark version 1.5.2
16/01/06 20:14:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/06 20:14:40 INFO spark.SecurityManager: Changing view acls to: centos
16/01/06 20:14:40 INFO spark.SecurityManager: Changing modify acls to: centos
16/01/06 20:14:40 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(centos); users with modify permissions: Set(centos)
16/01/06 20:14:41 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/01/06 20:14:41 INFO Remoting: Starting remoting
16/01/06 20:14:41 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.0.4:51079]
16/01/06 20:14:41 INFO util.Utils: Successfully started service 'sparkDriver' on port 51079.
16/01/06 20:14:41 INFO spark.SparkEnv: Registering MapOutputTracker
16/01/06 20:14:41 INFO spark.SparkEnv: Registering BlockManagerMaster
16/01/06 20:14:41 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-0d984b69-ad9c-4ced-ae65-ffd3bc1c79f5
16/01/06 20:14:41 INFO storage.MemoryStore: MemoryStore started with capacity 2.6 GB
16/01/06 20:14:41 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-f3b76c61-812b-4413-b86e-f42c1399d548/httpd-3a2c827d-4d6c-4d2e-8625-db27851c143d
16/01/06 20:14:41 INFO spark.HttpServer: Starting HTTP Server
16/01/06 20:14:41 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/01/06 20:14:41 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:42438
16/01/06 20:14:41 INFO util.Utils: Successfully started service 'HTTP file server' on port 42438.
16/01/06 20:14:41 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/01/06 20:14:41 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/01/06 20:14:41 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/01/06 20:14:41 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/01/06 20:14:41 INFO ui.SparkUI: Started SparkUI at http://192.168.0.4:4040
16/01/06 20:14:41 WARN metrics.MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
16/01/06 20:14:42 INFO client.AppClient$ClientEndpoint: Connecting to master spark://x.x.x.117:7077...
16/01/06 20:15:02 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@2d65e4c4 rejected from java.util.concurrent.ThreadPoolExecutor@7c8e1724[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
        at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
16/01/06 20:15:02 INFO storage.DiskBlockManager: Shutdown hook called
16/01/06 20:15:02 INFO util.ShutdownHookManager: Shutdown hook called
16/01/06 20:15:02 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-f3b76c61-812b-4413-b86e-f42c1399d548
Traceback (most recent call last):
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/pyspark/shell.py", line 43, in 
    sc = SparkContext(pyFiles=add_files)
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.py", line 113, in __init__
    conf, jsc, profiler_cls)
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.py", line 170, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.py", line 224, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 699, in __call__
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 369, in send_command
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 362, in send_command
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 318, in _get_connection
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 325, in _create_connection
  File "/opt/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 432, in start
py4j.protocol.Py4JNetworkError: An error occurred while trying to connect to the Java server
>>>


Looking at the log files generated for the master:


16/01/06 19:32:30 INFO master.Master: Registered signal handlers for [TERM, HUP, INT]
16/01/06 19:32:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/01/06 19:32:31 INFO spark.SecurityManager: Changing view acls to: root
16/01/06 19:32:31 INFO spark.SecurityManager: Changing modify acls to: root
16/01/06 19:32:31 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/01/06 19:32:31 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/01/06 19:32:31 INFO Remoting: Starting remoting
16/01/06 19:32:32 INFO util.Utils: Successfully started service 'sparkMaster' on port 7077.
16/01/06 19:32:32 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster@cassandra-spark-1:7077]
16/01/06 19:32:32 INFO master.Master: Starting Spark master at spark://cassandra-spark-1:7077
16/01/06 19:32:32 INFO master.Master: Running Spark version 1.5.2
16/01/06 19:32:32 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/01/06 19:32:33 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:8080
16/01/06 19:32:33 INFO util.Utils: Successfully started service 'MasterUI' on port 8080.
16/01/06 19:32:33 INFO ui.MasterWebUI: Started MasterWebUI at http://192.168.0.4:8080
16/01/06 19:32:33 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/01/06 19:32:33 INFO server.AbstractConnector: Started SelectChannelConnector@cassandra-spark-1:6066
16/01/06 19:32:33 INFO util.Utils: Successfully started service on port 6066.
16/01/06 19:32:33 INFO rest.StandaloneRestServer: Started REST server for submitting applications on port 6066
16/01/06 19:32:33 INFO master.Master: I have been elected leader! New state: ALIVE
16/01/06 19:32:35 INFO master.Master: Registering worker 192.168.0.7:52930 with 2 cores, 6.6 GB RAM
16/01/06 19:32:35 INFO master.Master: Registering worker 192.168.0.2:48119 with 2 cores, 6.6 GB RAM
16/01/06 19:32:35 INFO master.Master: Registering worker 192.168.0.4:56830 with 2 cores, 6.6 GB RAM
16/01/06 19:33:32 ERROR akka.ErrorMonitor: dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://sparkMaster@x.x.x.x:7077/]] arriving at [akka.tcp://sparkMaster@x.x.x.117:7077] inbound addresses are [akka.tcp://sparkMaster@cassandra-spark-1:7077]
akka.event.Logging$Error$NoCause$
16/01/06 19:33:52 INFO master.Master: 192.168.0.4:35598 got disassociated, removing it.
16/01/06 19:33:52 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.0.4:35598] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/01/06 19:33:52 INFO master.Master: 192.168.0.4:35598 got disassociated, removing it.
16/01/06 19:38:36 ERROR akka.ErrorMonitor: dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://sparkMaster@192.168.0.4:7077/]] arriving at [akka.tcp://sparkMaster@192.168.0.4:7077] inbound addresses are [akka.tcp://sparkMaster@cassandra-spark-1:7077]
akka.event.Logging$Error$NoCause$
16/01/06 19:38:56 INFO master.Master: 192.168.0.4:36078 got disassociated, removing it.
16/01/06 19:38:56 INFO master.Master: 192.168.0.4:36078 got disassociated, removing it.
16/01/06 19:38:56 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.0.4:36078] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/01/06 20:14:42 ERROR akka.ErrorMonitor: dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://sparkMaster@x.x.x.117:7077/]] arriving at [akka.tcp://sparkMaster@x.x.x.117:7077] inbound addresses are [akka.tcp://sparkMaster@cassandra-spark-1:7077]
akka.event.Logging$Error$NoCause$
16/01/06 20:15:02 INFO master.Master: 192.168.0.4:51079 got disassociated, removing it.
16/01/06 20:15:02 INFO master.Master: 192.168.0.4:51079 got disassociated, removing it.
16/01/06 20:15:02 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@192.168.0.4:51079] has failed, address is now gated for [5000] ms. Reason: [Disassociated]




I would appreciate any help. Thanks!

Recommended answer


Make sure that conf/spark-defaults.conf in your Spark installation has the master set, something like:

spark.master    spark://x.x.x.117:7077
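
A minimal sketch of what that could look like, assuming the stock layout of the Spark distribution named in the question (the install path is taken from the traceback above; copying the bundled template is the standard Spark convention):

$ cd /opt/spark-1.5.2-bin-hadoop2.6/conf
$ cp spark-defaults.conf.template spark-defaults.conf
$ echo 'spark.master    spark://x.x.x.117:7077' >> spark-defaults.conf
$ pyspark

One further observation, beyond the original answer: the master log shows the master's inbound address as akka.tcp://sparkMaster@cassandra-spark-1:7077, and the "dropping message ... for non-local recipient" errors appear exactly when the driver addresses it by IP. The Akka-based standalone master in Spark 1.x only accepts registrations addressed with the exact host string it bound to, so if connecting by IP keeps failing, it may be worth trying the hostname form:

$ MASTER=spark://cassandra-spark-1:7077 pyspark

Once the shell is actually attached, sc.master should echo that spark:// URL and the application should appear under "Running Applications" in the master WebUI.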
