Spark-Submit exception SparkException: Job aborted due to stage failure


Question


Whenever I try to run a spark-submit command like the one below, I get an exception. Could someone please suggest what's going wrong here?

My command:

spark-submit --class com.xyz.MyTestClass --master spark://<spark-master-IP>:7077  SparkTest.jar

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:0 failed 4 times, most recent failure: TID 7 on host <hostname> failed for unknown reason
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Solution

I am not sure whether the parameter

--master spark://<spark-master-IP>:7077

is literally what you wrote, rather than the actual IP of the master node. If so, you should replace it with the IP or public DNS of the master, for example:

--master spark://ec2-XX-XX-XX-XX.eu-west-1.compute.amazonaws.com:7077

If that's not the case, I would appreciate it if you could provide more information about the application error, as pointed out in the comments above. Also make sure that the --class parameter points to the actual main class of the application.
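Before resubmitting, it can also help to sanity-check the master URL and the jar contents yourself. A minimal sketch, assuming a POSIX shell; the hostname and jar/class names are the placeholders from this question, not real values:

```shell
#!/bin/sh
# Placeholder master URL from the answer above; replace with your own.
MASTER="spark://ec2-XX-XX-XX-XX.eu-west-1.compute.amazonaws.com:7077"

# Split the master URL into host and port so they can be probed.
HOSTPORT="${MASTER#spark://}"
HOST="${HOSTPORT%:*}"
PORT="${HOSTPORT##*:}"
echo "host=$HOST port=$PORT"

# Probe the master port (needs network access, so left commented out here):
# nc -z -w 5 "$HOST" "$PORT" && echo "master port open"

# Confirm the jar really contains the class passed to --class:
# unzip -l SparkTest.jar | grep 'com/xyz/MyTestClass.class'
```

If the port probe fails, the worker hosts likely cannot reach the master either, which matches the "failed for unknown reason" task failures in the stack trace.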
