Why does spark-submit fail with "Failed to load class for data source: org.apache.spark.sql.cassandra" with Cassandra connector in --jars?
Question
Spark version: 1.4.1
Cassandra Version: 2.1.8
Datastax Cassandra Connector: 1.4.2-SNAPSHOT.jar
The command I run:
./spark-submit --jars /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar --driver-class-path /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar --jars /usr/local/lib/spark-1.4.1/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.4.1.jar --jars /usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar --driver-class-path /usr/local/lib/spark-1.4.1/external/kafka/target/scala-2.10/spark-streaming-kafka_2.10-1.4.1.jar --driver-class-path /usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar --packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1 --executor-memory 6g --executor-cores 6 --master local[4] kafka_streaming.py
Here is the error I'm getting:
Py4JJavaError: An error occurred while calling o169.save.
: java.lang.RuntimeException: Failed to load class for data source: org.apache.spark.sql.cassandra
I must be doing something silly. Any help would be appreciated.
Answer
Try providing all your jars in a single --jars option (comma-separated). When --jars is passed multiple times, spark-submit does not accumulate the values; a later occurrence replaces the earlier one, so the Cassandra connector jar from your first --jars never makes it onto the classpath:
--jars yourFirstJar.jar,yourSecondJar.jar
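Applied to the command from the question, a combined invocation might look like the following. This is a sketch reusing the jar paths from the question; note that --jars entries are comma-separated while --driver-class-path entries are colon-separated:

```shell
./spark-submit \
  --jars /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar,/usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar \
  --driver-class-path /usr/local/src/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.4.2-SNAPSHOT.jar:/usr/local/lib/spark-1.4.1/external/kafka-assembly/target/scala-2.10/spark-streaming-kafka-assembly_2.10-1.4.1.jar \
  --executor-memory 6g --executor-cores 6 \
  --master local[4] \
  kafka_streaming.py
```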
A more convenient solution for development purposes would be to use the jars from Maven Central (comma-separated):
--packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1,com.datastax.spark:spark-cassandra-connector_2.10:1.4.1
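With --packages, spark-submit resolves both artifacts (and their transitive dependencies) from Maven Central, so no local jar paths or --driver-class-path entries are needed. A full invocation might look like this sketch, assuming the connector release version matches your Spark version:

```shell
./spark-submit \
  --packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1,com.datastax.spark:spark-cassandra-connector_2.10:1.4.1 \
  --executor-memory 6g --executor-cores 6 \
  --master local[4] \
  kafka_streaming.py
```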