Spark runs on Yarn cluster exitCode=13


Problem description



I am a Spark/YARN newbie and ran into exitCode=13 when I submitted a Spark job to the YARN cluster. When the job runs in local mode, everything is fine.

The command I used is:

/usr/hdp/current/spark-client/bin/spark-submit --class com.test.sparkTest --master yarn --deploy-mode cluster --num-executors 40 --executor-cores 4 --driver-memory 17g --executor-memory 22g --files /usr/hdp/current/spark-client/conf/hive-site.xml /home/user/sparkTest.jar

Spark Error Log:

16/04/12 17:59:30 INFO Client:
         client token: N/A
         diagnostics: Application application_1459460037715_23007 failed 2 times due to AM Container for appattempt_1459460037715_23007_000002 exited with  exitCode: 13
For more detailed output, check application tracking page:http://b-r06f2-prod.phx2.cpe.net:8088/cluster/app/application_1459460037715_23007Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e40_1459460037715_23007_02_000001
Exit code: 13
Stack trace: ExitCodeException exitCode=13:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
        at org.apache.hadoop.util.Shell.run(Shell.java:487)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)


**Yarn logs**

    16/04/12 23:55:35 INFO mapreduce.TableInputFormatBase: Input split length: 977 M bytes.
16/04/12 23:55:41 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:55:51 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:01 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:11 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
16/04/12 23:56:11 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x152f0b4fc0e7488
16/04/12 23:56:11 INFO zookeeper.ZooKeeper: Session: 0x152f0b4fc0e7488 closed
16/04/12 23:56:11 INFO zookeeper.ClientCnxn: EventThread shut down
16/04/12 23:56:11 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 2). 2003 bytes result sent to driver
16/04/12 23:56:11 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 82134 ms on localhost (2/3)
16/04/12 23:56:17 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x4508c270df09803
16/04/12 23:56:17 INFO zookeeper.ZooKeeper: Session: 0x4508c270df09803 closed
...
    16/04/12 23:56:21 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
16/04/12 23:56:21 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Timed out waiting for SparkContext.)
16/04/12 23:56:21 INFO spark.SparkContext: Invoking stop() from shutdown hook

Solution

It seems that you have set the master in your code to local:

SparkConf.setMaster("local[*]")
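This matters because a master set directly on the SparkConf in application code takes precedence over the `--master` flag passed to spark-submit. In cluster mode the YARN ApplicationMaster then waits for a SparkContext that registered against a local master instead of YARN, times out, and exits with code 13 (matching the "Timed out waiting for SparkContext" line in the logs above). A toy sketch of that precedence rule (illustrative only, not Spark's actual implementation):

```python
def resolve_master(code_conf, submit_flags):
    # Spark's configuration precedence: values set explicitly on the
    # SparkConf in code win over flags passed to spark-submit.
    return code_conf.get("spark.master") or submit_flags.get("--master")

# Hard-coding local[*] in the driver means --master yarn is ignored:
print(resolve_master({"spark.master": "local[*]"}, {"--master": "yarn"}))  # local[*]

# Leaving the master unset in code lets the spark-submit flag take effect:
print(resolve_master({}, {"--master": "yarn"}))  # yarn
```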

You have to leave the master unset in the code, and set it later when you issue spark-submit:

spark-submit --master yarn-client ...
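Applied to the original submit from the question (which used cluster mode rather than yarn-client), the end-to-end fix is simply to remove the `setMaster("local[*]")` call from the code and keep the command as it was; all paths and flags below are taken from the question:

```shell
# Same submit as in the question; with setMaster removed from the driver
# code, --master yarn --deploy-mode cluster now actually takes effect.
/usr/hdp/current/spark-client/bin/spark-submit \
  --class com.test.sparkTest \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 40 \
  --executor-cores 4 \
  --driver-memory 17g \
  --executor-memory 22g \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  /home/user/sparkTest.jar
```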
