Spark Kafka Streaming Issue


Problem Description


I am using Maven and have added the following dependencies:

    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>

I have also added the jar explicitly in the code:

    SparkConf sparkConf = new SparkConf().setAppName("KafkaSparkTest");
    JavaSparkContext sc = new JavaSparkContext(sparkConf);
    sc.addJar("/home/test/.m2/repository/org/apache/spark/spark-streaming-kafka_2.10/1.0.2/spark-streaming-kafka_2.10-1.0.2.jar");
    JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(5000));

It compiles fine without any errors, but when I run it through spark-submit I get the following error. Any help is much appreciated. Thanks for your time.

    bin/spark-submit --class "KafkaSparkStreaming" --master local[4] try/simple-project/target/simple-project-1.0.jar

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils
        at KafkaSparkStreaming.sparkStreamingTest(KafkaSparkStreaming.java:40)
        at KafkaSparkStreaming.main(KafkaSparkStreaming.java:23)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtils
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
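For reference, the call failing at KafkaSparkStreaming.java:40 is presumably a KafkaUtils consumer along the lines of this minimal sketch, using the jssc created above (the topic name, consumer group, and ZooKeeper address are placeholder assumptions, not taken from the question):

    import java.util.Collections;
    import java.util.Map;
    import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    // Topic name -> number of receiver threads for that topic (placeholder values)
    Map<String, Integer> topics = Collections.singletonMap("test-topic", 1);

    // Receiver-based Kafka stream via ZooKeeper (the Spark 1.x API);
    // KafkaUtils is the class the launcher cannot find at runtime
    JavaPairReceiverInputDStream<String, String> messages =
            KafkaUtils.createStream(jssc, "localhost:2181", "test-group", topics);

    messages.print();        // log a few records from each batch
    jssc.start();
    jssc.awaitTermination();

Nothing is wrong with such a call itself; the class is simply absent from the runtime classpath, which is what the solution below addresses.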

Solution

I met the same problem, and I solved it by building the jar with its dependencies included. (spark-submit launches the driver with only your application jar on the classpath; a jar pulled in via sc.addJar() is shipped to executors at runtime but is not visible when main() starts, hence the NoClassDefFoundError.)

  1. Remove "sc.addJar()" from your code.

  2. Add the build section below to your pom.xml:

    <build>
        <sourceDirectory>src/main/java</sourceDirectory>
        <testSourceDirectory>src/test/java</testSourceDirectory>
        <plugins>
          <!--
            Bind the maven-assembly-plugin to the package phase.
            This creates a single jar that bundles the project's
            dependencies, suitable for submitting to a cluster.
          -->
          <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
              <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
              <archive>
                <manifest>
                  <!-- optional here, since spark-submit already supplies the main class -->
                  <mainClass></mainClass>
                </manifest>
              </archive>
            </configuration>
            <executions>
              <execution>
                <id>make-assembly</id>
                <phase>package</phase>
                <goals>
                  <goal>single</goal>
                </goals>
              </execution>
            </executions>
          </plugin>
        </plugins>
    </build>

  3. Run mvn package.

  4. Submit the resulting "example-jar-with-dependencies.jar" from the target directory.
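Applied to the command from the question, the submit line then points at the assembled jar. The file name below is an assumption based on Maven's default <artifactId>-<version>-jar-with-dependencies naming; it will differ if you set a custom finalName:

    bin/spark-submit --class "KafkaSparkStreaming" --master local[4] try/simple-project/target/simple-project-1.0-jar-with-dependencies.jar

A common refinement, not part of the original answer, is to mark the spark-streaming dependency as provided so the Spark runtime is not bundled into the assembly; spark-streaming-kafka must keep the default compile scope, because the cluster's Spark installation does not ship it.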
