java.lang.ClassNotFoundException: com.datastax.spark.connector.rdd.partitioner.CassandraPartition
Problem description
I've been working with Cassandra for a little while and now I'm trying to set up Spark and the spark-cassandra-connector. I'm doing this with IntelliJ IDEA on Windows 10 (my first time with IntelliJ IDEA and Scala, too).
build.gradle
apply plugin: 'scala'
apply plugin: 'idea'
apply plugin: 'eclipse'

repositories {
    mavenCentral()
    flatDir {
        dirs 'runtime libs'
    }
}

idea {
    project {
        jdkName = '1.8'
        languageLevel = '1.8'
    }
}

dependencies {
    compile group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.4.5'
    compile group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.4.5'
    compile group: 'org.scala-lang', name: 'scala-library', version: '2.11.12'
    compile group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.11', version: '2.5.0'
    compile group: 'log4j', name: 'log4j', version: '1.2.17'
}

configurations.all {
    resolutionStrategy {
        force 'com.google.guava:guava:12.0.1'
    }
}

compileScala.targetCompatibility = "1.8"
compileScala.sourceCompatibility = "1.8"

jar {
    zip64 true
    archiveName = "ModuleName.jar"
    from {
        configurations.compile.collect {
            it.isDirectory() ? it : zipTree(it)
        }
    }
    manifest {
        attributes 'Main-Class': 'org.module.SentinelSparkModule'
    }
    exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
}
ModuleName.scala
package org.module

import org.apache.spark.sql.SparkSession
import com.datastax.spark.connector._
import org.apache.spark.sql.types.TimestampType

object SentinelSparkModule {

  case class Document(id: Int, time: TimestampType, data: String)

  def main(args: Array[String]) {
    val spark = SparkSession.builder
      .master("spark://192.168.0.3:7077")
      .appName("App")
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .config("spark.cassandra.connection.port", "9042")
      .getOrCreate()

    // I'm trying it without [Document] since it throws 'Failed to map constructor parameter id in
    // org.module.ModuleName.Document to a column of keyspace.table'
    val documentRDD = spark.sparkContext
      .cassandraTable/*[Document]*/("keyspace", "table")
      .select()
    documentRDD.take(10).foreach(println)
    spark.stop()
  }
}
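As an aside about the commented-out [Document] mapping error: TimestampType (from org.apache.spark.sql.types) is a Spark SQL schema descriptor, not a value type, so the connector cannot map a Cassandra timestamp column onto it. A sketch of the case class using a plain JVM timestamp instead (the column names here are assumed to match the actual table):

case class Document(id: Int, time: java.sql.Timestamp, data: String)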
I have a Spark master running at spark://192.168.0.3:7077 with one worker attached to it, but I haven't tried submitting the job as a compiled jar from the console; I'm just trying to get it to work in the IDE.
Thanks
Recommended answer
The Cassandra connector jar needs to be on the classpath of the workers. One way to do this is to build an uber jar with all required dependencies and submit it to the cluster.
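For illustration, a submission might look like the following. This is a sketch only: it assumes the uber jar produced by the build.gradle above lands in build/libs/ModuleName.jar and that the master from the question is reachable; it cannot be run without a live cluster.

spark-submit \
  --class org.module.SentinelSparkModule \
  --master spark://192.168.0.3:7077 \
  build/libs/ModuleName.jar

Alternatively, Spark can fetch the connector for the workers itself via --packages, which avoids bundling it into the jar:

spark-submit \
  --class org.module.SentinelSparkModule \
  --master spark://192.168.0.3:7077 \
  --packages com.datastax.spark:spark-cassandra-connector_2.11:2.5.0 \
  build/libs/ModuleName.jar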
Also, make sure you change the scope of the dependencies in your build file from compile to provided for all jars except the Cassandra connector.
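In Gradle terms, Maven's provided scope corresponds to the compileOnly configuration. A sketch of the dependency block under that advice, using the versions from the question:

dependencies {
    // Already on the workers' classpath (shipped with Spark), so compile-time only:
    compileOnly group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.4.5'
    compileOnly group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.4.5'
    compileOnly group: 'org.scala-lang', name: 'scala-library', version: '2.11.12'
    // Not shipped with Spark, so it must end up in the uber jar:
    implementation group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.11', version: '2.5.0'
}

Note that if you move off the legacy compile configuration, the jar task's configurations.compile.collect { ... } would also need updating (e.g. to configurations.runtimeClasspath), since compileOnly dependencies are deliberately excluded from the runtime classpath.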
Reference: https://reflectoring.io/maven-scopes-gradle-configurations/