Standalone spark cluster. Can't submit job programmatically -> java.io.InvalidClassException


Problem description

Spark fellows, I'm quite new to Spark, so I'm really hoping for your help.

I'm trying to schedule a fairly simple job on a Spark cluster from my laptop. It works when I submit it with ./spark-submit, but it throws an exception when I try to do the same thing programmatically.

Environment:
- Spark - 1 master node and 2 worker nodes (standalone mode). Spark was not compiled; the prebuilt binaries were downloaded. Spark version - 1.0.2
- Java version "1.7.0_45"
- The application jar is located in the same place on the client and on the worker nodes;
- The README.md file is copied to every node as well;

The application I'm trying to run:

import org.apache.spark.{SparkConf, SparkContext}

val logFile = "/user/vagrant/README.md"

// Point the driver at the standalone master and ship the application jar to the workers
val conf = new SparkConf()
conf.setMaster("spark://192.168.33.50:7077")
conf.setAppName("Simple App")
conf.setJars(List("file:///user/vagrant/spark-1.0.2-bin-hadoop1/bin/hello-apache-spark_2.10-1.0.0-SNAPSHOT.jar"))
conf.setSparkHome("/user/vagrant/spark-1.0.2-bin-hadoop1")

val sc = new SparkContext(conf)

// Read the file with two partitions and cache it in memory
val logData = sc.textFile(logFile, 2).cache()

...

So the problem is that this application runs on the cluster successfully when I do:

./spark-submit --class com.paycasso.SimpleApp --master spark://192.168.33.50:7077 --deploy-mode client file:///home/vagrant/spark-1.0.2-bin-hadoop1/bin/hello-apache-spark_2.10-1.0.0-SNAPSHOT.jar

But it doesn't work when I try to do the same thing programmatically by running it with sbt run.

Here is the stacktrace that I get on the master node:

14/09/04 15:09:44 ERROR Remoting: org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID = -6451051318873184044, local class serialVersionUID = 583745679236071411
java.io.InvalidClassException: org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID = -6451051318873184044, local class serialVersionUID = 583745679236071411
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
    at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
    at scala.util.Try$.apply(Try.scala:161)
    at akka.serialization.Serialization.deserialize(Serialization.scala:98)
    at akka.remote.serialization.MessageContainerSerializer.fromBinary(MessageContainerSerializer.scala:58)
    at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
    at scala.util.Try$.apply(Try.scala:161)
    at akka.serialization.Serialization.deserialize(Serialization.scala:98)
    at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
    at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:55)
    at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:55)
    at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:73)
    at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:764)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

What could be the solution to this? Thank you in advance.

Answer

After wasting a lot of time, I found the problem. Even though I don't use Hadoop/HDFS in my application, the hadoop-client dependency matters. The problem was that the hadoop-client version differed from the Hadoop version Spark was built for: Spark's Hadoop version is 1.2.1, but my application used 2.4.

When I changed the hadoop-client version to 1.2.1 in my app, I was able to execute the Spark code on the cluster.
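
For illustration, a minimal sketch of what pinning the dependency might look like in an sbt build (the coordinates below are assumptions based on the versions mentioned above, not taken from the original project):

// Hypothetical build.sbt fragment: pin hadoop-client to the Hadoop version
// that the prebuilt Spark 1.0.2 (Hadoop 1) binaries were built against.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.0.2",
  "org.apache.hadoop" % "hadoop-client" % "1.2.1"
)

The underlying point is that the driver's classpath and the cluster's binaries need compatible Spark/Hadoop versions; otherwise classes such as org.apache.spark.deploy.ApplicationDescription are serialized and deserialized with different serialVersionUIDs, which appears to be exactly the mismatch shown in the stacktrace above.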
