Could not initialize class com.datastax.spark.connector.types.TypeConverter$ while running job on apache spark 2.0.2 using cassandra connector


Question

I'm trying to run a simple count, from the Apache Spark shell, on a data set that was previously loaded into my Cassandra cluster. To do this I've created a simple Maven project that builds a fat jar. These are my dependencies:

    <!-- https://mvnrepository.com/artifact/com.cloudera.sparkts/sparkts -->
    <dependency>
        <groupId>com.cloudera.sparkts</groupId>
        <artifactId>sparkts</artifactId>
        <version>0.4.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.10 -->
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.10</artifactId>
        <version>2.0.0-M3</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>2.0.2</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/com.datastax.cassandra/cassandra-driver-core -->
    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-core</artifactId>
        <version>3.0.0</version>
    </dependency>
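
The build section that produces the fat jar isn't shown in the question; judging by the `-jar-with-dependencies` suffix in the jar name used below, it is most likely the maven-assembly-plugin with its jar-with-dependencies descriptor. A minimal sketch of such a build section (an assumption, not the asker's actual configuration):

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <!-- produces <artifact>-<version>-jar-with-dependencies.jar -->
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <!-- build the fat jar as part of "mvn package" -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>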

I'm running this jar in the Spark shell with the following command:

spark-shell --jars Sensors-1.0-SNAPSHOT-jar-with-dependencies.jar --executor-memory 512M

After loading the required dependencies, I try to run the following operations on my Spark instance:

import pl.agh.edu.kis.sensors._
import com.datastax.spark.connector._
val test = new TestConnector(sc)
test.count()

This is the error I receive:

    17/01/21 04:42:37 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.io.IOException: Exception during preparation of SELECT "coil_id", "event_time", "car_count", "insert_time" FROM "public"."traffic" WHERE token("coil_id") > ? AND token("coil_id") <= ?   ALLOW FILTERING: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.createStatement(CassandraTableScanRDD.scala:293)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.com$datastax$spark$connector$rdd$CassandraTableScanRDD$$fetchTokenRange(CassandraTableScanRDD.scala:307)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$19.apply(CassandraTableScanRDD.scala:335)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$19.apply(CassandraTableScanRDD.scala:335)
    at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
    at com.datastax.spark.connector.util.CountingIterator.hasNext(CountingIterator.scala:12)
    at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1763)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1134)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1134)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
    at com.datastax.spark.connector.types.TypeConverter$.<init>(TypeConverter.scala:116)
    at com.datastax.spark.connector.types.TypeConverter$.<clinit>(TypeConverter.scala)
    at com.datastax.spark.connector.types.BigIntType$.converterToCassandra(PrimitiveColumnType.scala:50)
    at com.datastax.spark.connector.types.BigIntType$.converterToCassandra(PrimitiveColumnType.scala:46)
    at com.datastax.spark.connector.types.ColumnType$.converterToCassandra(ColumnType.scala:229)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$13.apply(CassandraTableScanRDD.scala:282)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$13.apply(CassandraTableScanRDD.scala:282)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.createStatement(CassandraTableScanRDD.scala:282)
    ... 17 more
17/01/21 04:42:41 INFO CoarseGrainedExecutorBackend: Got assigned task 2
17/01/21 04:42:41 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
17/01/21 04:42:41 ERROR Executor: Exception in task 2.0 in stage 0.0 (TID 2)
java.io.IOException: Exception during preparation of SELECT "coil_id", "event_time", "car_count", "insert_time" FROM "public"."traffic" WHERE token("coil_id") > ? AND token("coil_id") <= ?   ALLOW FILTERING: Could not initialize class com.datastax.spark.connector.types.TypeConverter$
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.createStatement(CassandraTableScanRDD.scala:293)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.com$datastax$spark$connector$rdd$CassandraTableScanRDD$$fetchTokenRange(CassandraTableScanRDD.scala:307)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$19.apply(CassandraTableScanRDD.scala:335)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$19.apply(CassandraTableScanRDD.scala:335)
    at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
    at com.datastax.spark.connector.util.CountingIterator.hasNext(CountingIterator.scala:12)
    at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1763)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1134)
    at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1134)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class com.datastax.spark.connector.types.TypeConverter$
    at com.datastax.spark.connector.types.BigIntType$.converterToCassandra(PrimitiveColumnType.scala:50)
    at com.datastax.spark.connector.types.BigIntType$.converterToCassandra(PrimitiveColumnType.scala:46)
    at com.datastax.spark.connector.types.ColumnType$.converterToCassandra(ColumnType.scala:229)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$13.apply(CassandraTableScanRDD.scala:282)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$13.apply(CassandraTableScanRDD.scala:282)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD.createStatement(CassandraTableScanRDD.scala:282)
    ... 17 more
    [... the same "Could not initialize class com.datastax.spark.connector.types.TypeConverter$" exception repeats for the remaining tasks ...]

Here is my code:

import com.datastax.spark.connector.japi.CassandraRow;
import com.datastax.spark.connector.japi.rdd.CassandraTableScanJavaRDD;
import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

import static com.datastax.spark.connector.japi.CassandraJavaUtil.*;

/**
 * Created by Daniel on 20.01.2017.
 */
public class TestConnector {

    SparkContext sc;

    // Reuse an existing context (e.g. the one provided by spark-shell)
    // and point the connector at the Cassandra cluster.
    public TestConnector(SparkContext context){
        sc = context;
        sc.conf().set("spark.cassandra.connection.host","10.156.207.84")
                .set("spark.cores.max","1");
    }

    // Standalone variant that creates its own local context.
    public TestConnector(){
        SparkConf conf = new SparkConf()
                .setMaster("local[*]")
                .set("spark.cassandra.connection.host","10.156.207.84")
                .set("spark.cores.max","1");
        sc = new SparkContext(conf);
    }

    // Scan the public.traffic table and count its rows.
    public void count(){
        CassandraTableScanJavaRDD<CassandraRow> rdd = javaFunctions(sc).cassandraTable("public","traffic");
        System.out.println("Total readings: " + rdd.count());
    }

}


Scala version: 2.11.8, Spark version: 2.0.2, Cassandra version: 3.9, Java version: 1.8.0_111

Answer

The first problem you are running into looks like a Scala version mismatch. The default installation of Spark 2.0 uses Scala 2.11, but you have specified 2.10 for all of your dependencies. Change all the _2.10 artifacts to _2.11.

Next, you will run into a Guava mismatch problem, because you are including the Cassandra Java driver when you should not be. So remove the dependency on the Java driver.
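
Putting both fixes together, the dependency section would look roughly like this (a sketch: spark-cassandra-connector_2.11:2.0.0-M3 and spark-core_2.11:2.0.2 are the Scala 2.11 builds of the versions already used above, and the explicit cassandra-driver-core dependency is dropped entirely):

    <dependency>
        <groupId>com.cloudera.sparkts</groupId>
        <artifactId>sparkts</artifactId>
        <version>0.4.1</version>
    </dependency>
    <!-- Scala 2.11 build of the connector; it pulls in the Java driver
         it needs itself, so no explicit cassandra-driver-core dependency -->
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.11</artifactId>
        <version>2.0.0-M3</version>
    </dependency>
    <!-- Scala 2.11 build of Spark core, matching the Spark 2.0.2 installation -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.2</version>
    </dependency>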

See https://github.com/datastax/spark-cassandra-connector/blob/master/doc/FAQ.md#how-do-i-fix-guava-classpath-errors
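
As a side note, once the explicit driver dependency is gone, an alternative to bundling the connector into the fat jar is to let spark-shell resolve it from Maven Central (assuming the machine can reach it):

    spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.11:2.0.0-M3 --jars Sensors-1.0-SNAPSHOT-jar-with-dependencies.jar --executor-memory 512M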

