Apache Spark 1.5 with Cassandra: ClassCastException

This article describes how to resolve a ClassCastException encountered when using Apache Spark 1.5 with Cassandra; it may be a useful reference for anyone facing the same problem.

Problem description

I use the following software:

  1. Cassandra 2.1.9
  2. Spark 1.5
  3. Java, using the Cassandra driver provided by DataStax
  4. Ubuntu 12.04

When I run Spark locally using local[8], the program runs fine and the data is saved into Cassandra. However, when I submit the job to the Spark cluster, the following exception is thrown:

16 Sep 2015 03:08:58,808  WARN [task-result-getter-0] (Logging.scala:71) TaskSetManager - Lost task 3.0 in stage 0.0 (TID 3,
192.168.50.131): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.HashMap$SerializationProxy to field scala.collection.Map$WithDefault.underlying of type scala.collection.Map in instance of scala.collection.immutable.Map$WithDefault
        at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2083)
        at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1261)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1996)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
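
For context, here is a minimal sketch of the two run configurations described above, assuming the Java API; the class name, app name, and master URL are illustrative, not taken from the question:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class RunModes {
        public static void main(String[] args) {
            // Local mode: driver and executors share one JVM, so every class
            // on the application classpath is visible everywhere and the job
            // runs fine.
            SparkConf local = new SparkConf()
                    .setAppName("CassandraSave")   // illustrative name
                    .setMaster("local[8]");

            // Cluster mode: task closures are serialized on the driver and
            // deserialized on remote executors. Any mismatch between the
            // driver's and the executors' classpaths can surface as the
            // ClassCastException shown in the stack trace above.
            SparkConf cluster = new SparkConf()
                    .setAppName("CassandraSave")
                    .setMaster("spark://master-host:7077");  // illustrative URL

            JavaSparkContext sc = new JavaSparkContext(local);
            sc.stop();
        }
    }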

I am clueless about how to fix this error. I use only the following two dependencies:

  1. spark-assembly-1.5.0-hadoop2.6.0.jar --> comes with the Spark download
  2. spark-cassandra-connector-java-assembly-1.5.0-M1-SNAPSHOT.jar --> built from Git using sbt

I have also exported my bundled application jar into the Spark classpath. Kindly help, as I am not sure whether this is an application-specific error or a problem with the Spark distribution itself.

Solution

I finally found the issue.

The problem was that I was only adding my bundled application jar (fat jar) to the Spark context and excluding the following two jars:

  1. spark-assembly-1.5.0-hadoop2.6.0.jar
  2. spark-cassandra-connector-java-assembly-1.5.0-M1-SNAPSHOT.jar

It turns out that I should also add spark-cassandra-connector-java-assembly-1.5.0-M1-SNAPSHOT.jar to the Spark context and exclude only spark-assembly-1.5.0-hadoop2.6.0.jar, since the Spark assembly is already present on every worker as part of the Spark installation and must not be shipped again.
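
A minimal sketch of the corrected setup, assuming a standalone cluster and the Java API; the class name, paths, and master URL are illustrative, and only the two jar names come from the answer:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class JobLauncher {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("CassandraSaveJob")         // illustrative name
                    .setMaster("spark://master-host:7077")  // illustrative master URL
                    // Ship both the application fat jar AND the connector
                    // assembly to the executors; leaving the connector out is
                    // what caused the ClassCastException. The spark-assembly
                    // jar is deliberately NOT listed here, because it is
                    // already on every worker's classpath.
                    .setJars(new String[] {
                            "/path/to/my-bundled-app.jar",  // illustrative path
                            "/path/to/spark-cassandra-connector-java-assembly-1.5.0-M1-SNAPSHOT.jar"
                    });
            JavaSparkContext sc = new JavaSparkContext(conf);
            // ... build RDDs and save to Cassandra here ...
            sc.stop();
        }
    }

The same effect can be achieved with spark-submit's --jars option instead of SparkConf.setJars; either way, the point is that the connector assembly must reach the executors while the Spark assembly must not be duplicated.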

