org.apache.spark.SparkException:作业由于阶段失败而中止:来自应用程序的任务 [英] org.apache.spark.SparkException: Job aborted due to stage failure: Task from application

查看:153
本文介绍了org.apache.spark.SparkException:作业由于阶段失败而中止:来自应用程序的任务的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在独立集群上运行spark应用程序时遇到问题. (我使用spark 1.1.0版本). 我通过命令成功运行了主服务器:

I have a problem with running spark application on standalone cluster. (I use spark 1.1.0 version). I succesfully run master server by command:

bash start-master.sh 

然后我通过命令运行一个工人:

Then I run one worker by command:

bash spark-class org.apache.spark.deploy.worker.Worker spark://fujitsu11:7077

在主用户的网络用户界面上:

At master’s web UI:

http://localhost:8080  

我知道,主人和工人正在运行.

I see, that master and worker are running.

然后,我从Eclipse Luna运行我的应用程序.我通过命令成功连接到集群

Then I run my application from Eclipse Luna. I successfully connect to cluster by command

JavaSparkContext sc = new JavaSparkContext("spark://fujitsu11:7077", "myapplication");

在该应用程序正常工作之后,但是当程序实现以下代码时:

And after that application works, but when program achieve following code:

 JavaRDD<Document> collectionRdd = sc.parallelize(list);

它崩溃并显示以下错误消息:

It's crashing with following error message:

 org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 4 times, most recent failure: Lost task 7.3 in stage 0.0 (TID 11, fujitsu11.inevm.ru):java.lang.ClassNotFoundException: maven.maven1.Document
 java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 java.security.AccessController.doPrivileged(Native Method)
 java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    java.lang.Class.forName0(Native Method)
    java.lang.Class.forName(Class.java:270)
    org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
    java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
    java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
    org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:74)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:606)
    java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:744)
 Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

在外壳中,我发现:

14/11/12 18:46:06 INFO ExecutorRunner: Launch command: "C:\PROGRA~1\Java\jdk1.7.0_51/bin/java"  "-cp" ";;D:\spark\bin\..\conf;D:\spark\bin\..\lib\spark-assembly-
1.1.0-hadoop1.0.4.jar;;D:\spark\bin\..\lib\datanucleus-api-jdo-3.2.1.jar;D:\spar
k\bin\..\lib\datanucleus-core-3.2.2.jar;D:\spark\bin\..\lib\datanucleus-rdbms-3.
2.1.jar" "-XX:MaxPermSize=128m" "-Dspark.driver.port=50913" "-Xms512M" "-Xmx512M
" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://sparkDriv
er@fujitsu11.inevm.ru:50913/user/CoarseGrainedScheduler" "0" "fujitsu11.inevm.ru
" "8" "akka.tcp://sparkWorker@fujitsu11.inevm.ru:50892/user/Worker" "app-2014111
2184605-0000"
14/11/12 18:46:40 INFO Worker: Asked to kill executor app-20141112184605-0000/0
14/11/12 18:46:40 INFO ExecutorRunner: Runner thread for executor app-2014111218
4605-0000/0 interrupted
14/11/12 18:46:40 INFO ExecutorRunner: Killing process!
14/11/12 18:46:40 INFO Worker: Executor app-20141112184605-0000/0 finished with
state KILLED exitStatus 1
14/11/12 18:46:40 INFO LocalActorRef: Message [akka.remote.transport.ActorTransp
ortAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to
Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtoco
l-tcp%3A%2F%2FsparkWorker%40192.168.3.5%3A50955-2#1066511138] was not delivered.
[1] dead letters encountered. This logging can be turned off or adjusted with c
onfiguration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-
shutdown'.
14/11/12 18:46:40 INFO LocalActorRef: Message [akka.remote.transport.Association
Handle$Disassociated] from Actor[akka://sparkWorker/deadLetters] to Actor[akka:/
/sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2
FsparkWorker%40192.168.3.5%3A50955-2#1066511138] was not delivered. [2] dead let
ters encountered. This logging can be turned off or adjusted with configuration
settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/11/12 18:46:41 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker
@fujitsu11.inevm.ru:50892] -> [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954
]: Error [Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:5
0954]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sp
arkExecutor@fujitsu11.inevm.ru:50954]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon
$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:5
0954
]
14/11/12 18:46:42 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker
@fujitsu11.inevm.ru:50892] -> [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954
]: Error [Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:5
0954]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sp
arkExecutor@fujitsu11.inevm.ru:50954]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon
$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:5
0954
]
14/11/12 18:46:43 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker
@fujitsu11.inevm.ru:50892] -> [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:50954
]: Error [Association failed with [akka.tcp://sparkExecutor@fujitsu11.inevm.ru:5
0954]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sp
arkExecutor@fujitsu11.inevm.ru:50954]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon
$2: Connection refused: no further information: fujitsu11.inevm.ru/192.168.3.5:5
0954
]

在日志中:

14/11/12 18:46:41 ERROR EndpointWriter: AssociationError    [akka.tcp://sparkMaster@fujitsu11:7077]     -> [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]:   Error [Association failed with [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]] [
akka.remote.EndpointAssociationException: Association failed with   [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection  refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50913
]
14/11/12 18:46:42 INFO Master: akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913 got disassociated,   removing it.
14/11/12 18:46:42 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@fujitsu11:7077] -> [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]: Error [Association failed with   [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]] [
akka.remote.EndpointAssociationException: Association failed with   [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection  refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50913
]
14/11/12 18:46:43 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@fujitsu11:7077] -> [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]: Error [Association failed with   [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]] [
akka.remote.EndpointAssociationException: Association failed with   [akka.tcp://sparkDriver@fujitsu11.inevm.ru:50913]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection  refused: no further information: fujitsu11.inevm.ru/192.168.3.5:50913
]

我在Google上搜索了很多,但不知道怎么了... 我在这里发现了一些类似的讨论:

I googled a lot but I have no idea whats wrong... I found a bit similar discussion here:

https://github.com/datastax/spark-cassandra-connector/issues/187

但这不能解决我的问题...

But it doesn't solve my problem...

有人知道怎么了吗?

谢谢.

推荐答案

找到了一种使用IDE/Maven运行它的方法

Found a way to run it using IDE / Maven

  1. 创建一个胖子罐(一个包含所有依赖项的胖子).为此使用阴影插件. pom示例:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.2</version>
    <configuration>
        <filters>
            <filter>
                <artifact>*:*</artifact>
                <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                </excludes>
            </filter>
        </filters>
    </configuration>
    <executions>
        <execution>
            <id>job-driver-jar</id>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <shadedArtifactAttached>true</shadedArtifactAttached>
                <shadedClassifierName>driver</shadedClassifierName>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    <!--
                    Some care is required:
                    http://doc.akka.io/docs/akka/snapshot/general/configuration.html
                    -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                        <resource>reference.conf</resource>
                    </transformer>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>mainClass</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
        <execution>
            <id>worker-library-jar</id>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <shadedArtifactAttached>true</shadedArtifactAttached>
                <shadedClassifierName>worker</shadedClassifierName>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>

  1. 现在,我们必须将已编译的jar文件发送到集群.为此,请在spark配置中指定jar文件,如下所示:

SparkConf conf =新 SparkConf().setAppName("appName").setMaster("spark://machineName:7077").setJars(new String [] {"target/appName-1.0-SNAPSHOT-driver.jar"});

SparkConf conf = new SparkConf().setAppName("appName").setMaster("spark://machineName:7077").setJars(new String[] {"target/appName-1.0-SNAPSHOT-driver.jar"});

  1. 运行mvn clean软件包以创建Jar文件.它将在您的目标文件夹中创建.

  1. Run mvn clean package to create the Jar file. It will be created in your target folder.

使用您的IDE或使用maven命令运行:

Run using your IDE or using maven command :

mvn exec:java -Dexec.mainClass ="className"

mvn exec:java -Dexec.mainClass="className"

这不需要提交火花.只记得运行前先打包文件

This does not require spark-submit. Just remember to package file before running

如果您不想对jar路径进行硬编码,则可以执行以下操作:

If you don't want to hardcode the jar path, you can do this :

  1. 在配置中,输入:

SparkConf conf =新的SparkConf() .setAppName("appName") .setMaster("spark://machineName:7077") .setJars(JavaSparkContext.jarOfClass(this.getClass()));

SparkConf conf = new SparkConf() .setAppName("appName") .setMaster("spark://machineName:7077") .setJars(JavaSparkContext.jarOfClass(this.getClass()));

  1. 创建胖子罐(如上)并在运行package命令后使用maven运行:

java -jar target/application-1.0-SNAPSHOT-driver.jar

java -jar target/application-1.0-SNAPSHOT-driver.jar

这将从加载类的jar中获取jar.

This will take the jar from the jar the class was loaded.

这篇关于org.apache.spark.SparkException:作业由于阶段失败而中止:来自应用程序的任务的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆