Spark Shell - __spark_libs__.zip does not exist


Problem Description


I'm new to Spark and I'm busy setting up a Spark cluster with HA enabled.

When starting a Spark shell for testing via: bash spark-shell --master yarn --deploy-mode client

I receive the following error (see the full error below): file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip does not exist

The application is marked as failed on the YARN web UI and no containers are started.

When starting a shell via: spark-shell --master local it opens without errors.

I have noticed that files are only being written to the tmp folder on the node where the shell is created.

Any help will be much appreciated. Let me know if more information is required.

Environment Variables:

HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/

YARN_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/

SPARK_HOME=/opt/spark-2.0.2-bin-hadoop2.7/

Full error message:

16/11/30 21:08:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/11/30 21:08:49 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 
16/11/30 21:09:03 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_e14_1480532715390_0001_02_000003 on host: slave2. Exit status: -1000. Diagnostics: File file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip does not exist 
java.io.FileNotFoundException: File file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip
does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

16/11/30 22:29:28 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
16/11/30 22:29:28 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalStateException: Spark context stopped while waiting for backend
        at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:584)
        at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:162)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:546)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
        at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
        at $line3.$read$$iw$$iw.<init>(<console>:15)
        at $line3.$read$$iw.<init>(<console>:31)
        at $line3.$read.<init>(<console>:33)
        at $line3.$read$.<init>(<console>:37)
        at $line3.$read$.<clinit>(<console>)
        at $line3.$eval$.$print$lzycompute(<console>:7)
        at $line3.$eval$.$print(<console>:6)
        at $line3.$eval.$print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
        at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
        at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
        at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
        at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
        at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
        at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
        at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
        at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:94)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
        at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
        at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
        at org.apache.spark.repl.Main$.doMain(Main.scala:68)
        at org.apache.spark.repl.Main$.main(Main.scala:51)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>2000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>master:2181,slave1:2181,slave2:2181</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>master:23140</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>master:23130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm1</name>
    <value>master:23189</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>master:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>master:23125</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>master:23141</value>
  </property>

  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>slave1:23140</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>slave1:23130</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address.rm2</name>
    <value>slave1:23189</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>slave1:23188</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>slave1:23125</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>slave1:23141</value>
  </property>

  <property>
    <description>Address where the localizer IPC is.</description>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:23344</value>
  </property>
  <property>
    <description>NM Webapp address.</description>
    <name>yarn.nodemanager.webapp.address</name>
    <value>0.0.0.0:23999</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/tmp/pseudo-dist/yarn/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/tmp/pseudo-dist/yarn/log</value>
  </property>
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>23080</value>
  </property>
  <property>
    <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
    <value>true</value>
  </property>
</configuration>

Solution

This error was due to the config in the core-site.xml file.

Please note that to find this file your HADOOP_CONF_DIR env variable must be set.

In my case I added HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/ to ./conf/spark-env.sh
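
For reference, a minimal ./conf/spark-env.sh based on the environment variables listed above might look like the following sketch (the paths match my setup; adjust them for yours):

# ./conf/spark-env.sh - point Spark at the Hadoop/YARN client configs
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/
export YARN_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/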

See: Spark Job running on Yarn Cluster java.io.FileNotFoundException: File does not exits, eventhough the file exits on the master node

core-site.xml

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property> 
</configuration>

If this endpoint is unreachable, or if Spark detects that the destination file system is the same as the local one, the lib files will not be distributed to the other nodes in your cluster, causing the errors above.
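
As an aside, the WARN line in the log above ("Neither spark.yarn.jars nor spark.yarn.archive is set") hints at a way to sidestep local-filesystem distribution entirely: publish the Spark jars to HDFS once and point spark.yarn.archive at them, so YARN containers localize from HDFS instead of the driver's /tmp. A sketch, assuming the illustrative HDFS path below:

# Package the Spark jars (uncompressed) and upload them to HDFS once
jar cv0f spark-libs.jar -C $SPARK_HOME/jars/ .
hdfs dfs -mkdir -p /spark/jars
hdfs dfs -put spark-libs.jar /spark/jars/
# Then in ./conf/spark-defaults.conf:
# spark.yarn.archive  hdfs://master:9000/spark/jars/spark-libs.jar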

In my situation the node I was on couldn't reach port 9000 on the specified host.
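
A quick way to confirm this kind of connectivity problem from the affected node (assuming nc is installed) is:

# Is the NameNode RPC port reachable from this node?
nc -zv master 9000
# Or ask HDFS directly
hdfs dfs -ls hdfs://master:9000/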

Debugging

Turn the log level up to INFO. You can do this by:

  1. Copy ./conf/log4j.properties.template to ./conf/log4j.properties

  2. In that file, set log4j.logger.org.apache.spark.repl.Main = INFO (both steps are scripted in the sketch after this list)
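
Both steps as shell commands, run from SPARK_HOME (the echo line appends the setting; later entries override earlier ones in a properties file):

cp ./conf/log4j.properties.template ./conf/log4j.properties
echo "log4j.logger.org.apache.spark.repl.Main=INFO" >> ./conf/log4j.properties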

Start your Spark shell as normal. If your issue is the same as mine, you should see an INFO message such as: INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-c1a6cdcd-d348-4253-8755-5086a8931e75/__spark_libs__1391186608525933727.zip

This should lead you to the problem, as it starts the chain reaction that results in the missing files.
