AWS EMR using Spark steps in cluster mode. Application application_ finished with failed status


Problem description

I'm trying to launch a cluster using the AWS CLI. I use the following command:

aws emr create-cluster --name "Config1" --release-label emr-5.0.0 --applications Name=Spark --use-default-role --log-uri 's3://aws-logs-813591802533-us-west-2/elasticmapreduce/' --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m1.medium InstanceGroupType=CORE,InstanceCount=2,InstanceType=m1.medium

The cluster is created successfully. Then I add this command:

aws emr add-steps --cluster-id ID_CLUSTER --region us-west-2 --steps Name=SparkSubmit,Jar="command-runner.jar",Args=[spark-submit,--deploy-mode,cluster,--master,yarn,--executor-memory,1G,--class,Traccia2014,s3://tracceale/params/scalaProgram.jar,s3://tracceale/params/configS3.txt,30,300,2,"s3a://tracceale/Tempi1"],ActionOnFailure=CONTINUE

After some time, the step failed. This is the LOG file:

 17/02/22 11:00:07 INFO RMProxy: Connecting to ResourceManager at ip-172-31-31-190.us-west-2.compute.internal/172.31.31.190:8032
 17/02/22 11:00:08 INFO Client: Requesting a new application from cluster with 2 NodeManagers
 17/02/22 11:00:08 INFO Client: Verifying our application has not requested  
 Exception in thread "main" org.apache.spark.SparkException: Application application_1487760984275_0001 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1132)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1175)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
 17/02/22 11:01:02 INFO ShutdownHookManager: Shutdown hook called
 17/02/22 11:01:02 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-27baeaa9-8b3a-4ae6-97d0-abc1d3762c86
 Command exiting with ret '1'

Locally (on the Hortonworks HDP 2.5 sandbox) I run:

./spark-submit --class Traccia2014 --master local[*] --executor-memory 2G /usr/hdp/current/spark2-client/ScalaProjects/ScripRapportoBatch2.1/target/scala-2.11/traccia-22-ottobre_2.11-1.0.jar "/home/tracce/configHDFS.txt" 30 300 3

and everything works fine. I've already read something related to my problem, but I can't figure it out.

UPDATE

When I checked the Application Master logs, I found this error:

17/02/22 15:29:54 ERROR ApplicationMaster: User class threw exception: java.io.FileNotFoundException: s3:/tracceale/params/configS3.txt (No such file or directory)

at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at scala.io.Source$.fromFile(Source.scala:91)
at scala.io.Source$.fromFile(Source.scala:76)
at scala.io.Source$.fromFile(Source.scala:54)
at Traccia2014$.main(Rapporto.scala:40)
at Traccia2014.main(Rapporto.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
 17/02/22 15:29:55 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.io.FileNotFoundException: s3:/tracceale/params/configS3.txt (No such file or directory))

I pass the S3 path mentioned above, "s3://tracceale/params/configS3.txt", to the function 'fromFile' like this:

for(line <- scala.io.Source.fromFile(logFile).getLines())

How could I solve it? Thanks in advance.

Recommended answer

The file may simply be missing from the expected location. You might be able to see it after SSHing into the EMR cluster, but the step itself still cannot find it and throws the file-not-found exception.

In this scenario, what I did was:

Step 1: Check that the file exists in the project directory that was copied to the EMR cluster.

For example, mine was in `//usr/local/project_folder/`.

Step 2: Copy the script you expect to run on EMR to a local path on the cluster.

For example, I copied `//usr/local/project_folder/script_name.sh` to `/home/hadoop/`.

Step 3: Execute the script from `/home/hadoop/` by passing its absolute path to command-runner.jar:

command-runner.jar bash /home/hadoop/script_name.sh

With that, my script ran. Hope this is helpful to someone.
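
As for the FileNotFoundException in the question itself: scala.io.Source.fromFile only reads from the local filesystem of whichever node runs the driver (in cluster deploy mode, a YARN container), so an s3:// URI is treated as the nonexistent local path s3:/tracceale/params/configS3.txt. Below is a minimal sketch of reading the object through Hadoop's FileSystem API instead, assuming the S3 connector that EMR configures is on the classpath; the object name is taken from the question and the println loop is only a placeholder for the real processing:

import java.net.URI
import scala.io.Source
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object ReadConfigFromS3 {
  def main(args: Array[String]): Unit = {
    // Path taken from the question; in the real job it would arrive as a program argument.
    val configPath = "s3://tracceale/params/configS3.txt"

    // Ask Hadoop for the filesystem that owns this URI; on EMR this resolves to the S3 connector.
    val fs = FileSystem.get(new URI(configPath), new Configuration())

    // fs.open returns an InputStream, which scala.io.Source can consume directly.
    val in = fs.open(new Path(configPath))
    try {
      for (line <- Source.fromInputStream(in).getLines())
        println(line) // placeholder for the actual per-line processing
    } finally {
      in.close()
    }
  }
}

Inside a Spark application it is usually preferable to pass the SparkContext's hadoopConfiguration rather than a fresh Configuration, so the job picks up the cluster's S3 settings; sc.textFile("s3://...") is an even simpler alternative when the whole file can be collected on the driver.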
