Spark 1.3.0: Running Pi example on YARN fails


Problem description

I have Hadoop 2.6.0.2.2.0.0-2041 with Hive 0.14.0.2.2.0.0-2041. After building Spark with the command:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests package

I try to run the Pi example on YARN with the following command:

export HADOOP_CONF_DIR=/etc/hadoop/conf
/var/home2/test/spark/bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--executor-memory 3G \
--num-executors 50 \
hdfs:///user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar \
1000

I get the exception: application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_000002 exited with exitCode: 1, which in fact is Diagnostics: Exception from container-launch (please see the log below).

The application tracking URL reveals the following messages:

java.lang.Exception: Unknown container. Container either has not started or has already completed or doesn't belong to this node at all

and also:

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

I have Hadoop working fine on 4 nodes and am completely at a loss as to how to make Spark work on YARN.

Should I set the spark.yarn.access.namenodes Spark configuration property? My application does not need to access any name nodes directly, but maybe this would solve the problem?
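For context, as I understand it spark.yarn.access.namenodes takes a comma-separated list of NameNode URIs and only matters on secure (Kerberos) clusters, where Spark needs it to obtain HDFS delegation tokens. Something like the following (the hostnames here are placeholders, not my actual cluster):

spark.yarn.access.namenodes hdfs://nn1.example.com:8020,hdfs://nn2.example.com:8020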

Please advise where to look; any ideas would be of great help. Thank you!

Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/04/06 10:53:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/06 10:53:42 INFO impl.TimelineClientImpl: Timeline service address: http://etl-hdp-yarn.foo.bar.com:8188/ws/v1/timeline/
15/04/06 10:53:42 INFO client.RMProxy: Connecting to ResourceManager at etl-hdp-yarn.foo.bar.com/192.168.0.16:8050
15/04/06 10:53:42 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
15/04/06 10:53:42 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container)
15/04/06 10:53:42 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/04/06 10:53:42 INFO yarn.Client: Setting up container launch context for our AM
15/04/06 10:53:42 INFO yarn.Client: Preparing resources for our AM container
15/04/06 10:53:43 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/04/06 10:53:43 INFO yarn.Client: Uploading resource file:/var/home2/test/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar -> hdfs://etl-hdp-nn1.foo.bar.com:8020/user/test/.sparkStaging/application_1427875242006_0029/spark-assembly-1.3.0-hadoop2.6.0.jar
15/04/06 10:53:44 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar
15/04/06 10:53:44 INFO yarn.Client: Setting up the launch environment for our AM container
15/04/06 10:53:44 INFO spark.SecurityManager: Changing view acls to: test
15/04/06 10:53:44 INFO spark.SecurityManager: Changing modify acls to: test
15/04/06 10:53:44 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test); users with modify permissions: Set(test)
15/04/06 10:53:44 INFO yarn.Client: Submitting application 29 to ResourceManager
15/04/06 10:53:44 INFO impl.YarnClientImpl: Submitted application application_1427875242006_0029
15/04/06 10:53:45 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:45 INFO yarn.Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1428317623905
     final status: UNDEFINED
     tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/
     user: test
15/04/06 10:53:46 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:47 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:48 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
15/04/06 10:53:49 INFO yarn.Client: Application report for application_1427875242006_0029 (state: FAILED)
15/04/06 10:53:49 INFO yarn.Client: 
     client token: N/A
     diagnostics: Application application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0029_02_000001
Exit code: 1
Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1428317623905
     final status: FAILED
     tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/cluster/app/application_1427875242006_0029
     user: test
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Solution

If you are using Spark with HDP, then you have to do the following things. The root cause is visible in the container log above: the generated launch_container.sh references classpath entries such as /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar, and because the hdp.version property is never supplied, the shell fails with "bad substitution" and the ApplicationMaster class cannot be loaded.

  1. Add these entries to your $SPARK_HOME/conf/spark-defaults.conf (an equivalent one-off command-line form is sketched after this list):

    spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041 (your installed HDP version)

    spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041 (your installed HDP version)

  2. Create a java-opts file in $SPARK_HOME/conf and add the installed HDP version to that file, like:

-Dhdp.version=2.2.0.0-2041 (your installed HDP version)
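As an alternative to editing spark-defaults.conf, the same two properties can be passed per job on the spark-submit command line. A sketch reusing the submit command from the question (untested; substitute your own installed HDP version):

/var/home2/test/spark/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --conf spark.driver.extraJavaOptions=-Dhdp.version=2.2.0.0-2041 \
  --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.2.0.0-2041 \
  --executor-memory 3G \
  --num-executors 50 \
  hdfs:///user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar \
  1000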

To find out the installed HDP version, run the command hdp-select status hadoop-client on the cluster; a scripted version of both steps is sketched below.
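A minimal shell sketch tying both steps together, assuming hdp-select status hadoop-client prints a line of the form "hadoop-client - 2.2.0.0-2041" (the awk field below depends on that assumed output format):

# Extract the HDP version (e.g. 2.2.0.0-2041) from the assumed
# "hadoop-client - <version>" output of hdp-select
HDP_VERSION=$(hdp-select status hadoop-client | awk '{print $3}')

# Step 1: pass the version to the driver and YARN AM JVMs
cat >> "$SPARK_HOME/conf/spark-defaults.conf" <<EOF
spark.driver.extraJavaOptions -Dhdp.version=$HDP_VERSION
spark.yarn.am.extraJavaOptions -Dhdp.version=$HDP_VERSION
EOF

# Step 2: create the java-opts file with the same property
echo "-Dhdp.version=$HDP_VERSION" > "$SPARK_HOME/conf/java-opts"

With these in place, the ${hdp.version} placeholders in launch_container.sh should resolve and the "bad substitution" error should go away.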
