Spark and MongoDB application in Scala 2.10: Maven build error

Problem Description

I want to build a Scala application with Maven dependencies for Spark and MongoDB. The Scala version I use is 2.10. My pom looks like this (irrelevant parts left out):

<properties>
    <maven.compiler.source>1.6</maven.compiler.source>
    <maven.compiler.target>1.6</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.tools.version>2.10</scala.tools.version>
    <!-- Put the Scala version of the cluster -->
    <scala.version>2.10.5</scala.version>
</properties>

<!-- repository to add org.apache.spark -->
<repositories>
    <repository>
        <id>cloudera-repo-releases</id>
        <url>https://repository.cloudera.com/artifactory/repo/</url>
    </repository>
</repositories>

<build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <!-- <pluginManagement>  -->    
        <plugins>
            <plugin>
                <!-- see http://davidb.github.com/scala-maven-plugin -->
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.1.3</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <arg>-make:transitive</arg>
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.13</version>
                <configuration>
                    <useFile>false</useFile>
                    <disableXmlReport>true</disableXmlReport>
                    <!-- If you have classpath issue like NoDefClassError,... -->
                    <!-- useManifestOnlyJar>false</useManifestOnlyJar -->
                    <includes>
                        <include>**/*Test.*</include>
                        <include>**/*Suite.*</include>
                    </includes>
                </configuration>
            </plugin>

            <!-- "package" command plugin -->
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    <!-- </pluginManagement>  -->       
</build>

<dependencies>  
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.spark</groupId>
        <artifactId>mongo-spark-connector_2.10</artifactId>
        <version>1.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.scala</groupId>
        <artifactId>mongo-scala-driver_2.11</artifactId>
        <version>1.1.1</version>
    </dependency>
</dependencies>

When I run mvn clean assembly:assembly, the following error occurs:

C:\Develop\workspace\SparkApplication>mvn clean assembly:assembly
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building SparkApplication 0.0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ SparkApplication ---
[INFO] Deleting C:\Develop\workspace\SparkApplication\target
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building SparkApplication 0.0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> maven-assembly-plugin:2.4.1:assembly (default-cli) > package @ SparkApplication >>>
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ SparkApplication ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\Develop\workspace\SparkApplication\src\main\resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ SparkApplication ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- scala-maven-plugin:3.1.3:compile (default) @ SparkApplication ---
[WARNING]  Expected all dependencies to require Scala version: 2.10.5
[WARNING]  xx.xxx.xxx:SparkApplication:0.0.1-SNAPSHOT requires scala version: 2.10.5
[WARNING]  com.twitter:chill_2.10:0.5.0 requires scala version: 2.10.4
[WARNING] Multiple versions of scala libraries detected!
[INFO] C:\Develop\workspace\SparkApplication\src\main\scala:-1: info: compiling
[INFO] Compiling 1 source files to C:\Develop\workspace\SparkApplication\target\classes at 1477993255625
[INFO] No known dependencies. Compiling everything
[ERROR] error: bad symbolic reference. A signature in package.class refers to type compileTimeOnly
[INFO] in package scala.annotation which is not available.
[INFO] It may be completely missing from the current classpath, or the version on
[INFO] the classpath might be incompatible with the version used when compiling package.class.
[ERROR] C:\Develop\workspace\SparkApplication\src\main\scala\com\examples\MainExample.scala:33: error: Reference to method intWrapper in class LowPriorityImplicits should not have survived past type checking,
[ERROR] it should have been processed and eliminated during expansion of an enclosing macro.
[ERROR]     val count = sc.parallelize(1 to NUM_SAMPLES).map{i =>
[ERROR]                                ^
[ERROR] two errors found
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 10.363 s
[INFO] Finished at: 2016-11-01T10:40:58+01:00
[INFO] Final Memory: 20M/353M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.1.3:compile (default) on project SparkApplication: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 1(Exit value: 1) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

The error occurs only when the mongo-scala-driver_2.11 dependency is added; without it, the jar builds fine. My code is currently the Pi estimation example from the Spark website:

import org.apache.spark.{SparkConf, SparkContext}

// Number of random samples; defined here so the snippet is self-contained
val NUM_SAMPLES = 100000

val conf = new SparkConf()
  .setAppName("Cluster Application")
  //.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

val sc = new SparkContext(conf)

// Sample points in the unit square and count those inside the quarter circle
val count = sc.parallelize(1 to NUM_SAMPLES).map { i =>
  val x = Math.random()
  val y = Math.random()
  if (x * x + y * y < 1) 1 else 0
}.reduce(_ + _)
println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)

I also tried adding the following exclusion to each dependency, as I found this suggested in a GitHub issue. It did not help though:

    <exclusions>
        <exclusion>
            <!-- make sure wrong scala version is not pulled in -->
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
        </exclusion>
    </exclusions>         

How can I fix this? The MongoDB Scala Driver seems to be built against Scala 2.11, but Spark requires Scala 2.10.

Recommended Answer

Remove the Mongo Scala Driver dependency; it is not compiled for Scala 2.10 and is therefore not compatible.
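
For reference, a minimal sketch of the cleaned-up dependencies block, keeping only the Scala 2.10 artifacts from the question and dropping the Scala 2.11 driver:

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.1</version>
    </dependency>
    <!-- The _2.10 suffix must match the Scala version of every other dependency -->
    <dependency>
        <groupId>org.mongodb.spark</groupId>
        <artifactId>mongo-spark-connector_2.10</artifactId>
        <version>1.1.0</version>
    </dependency>
    <!-- mongo-scala-driver_2.11 removed: it pulls the Scala 2.11 library onto the classpath -->
</dependencies>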

The good news is that the MongoDB Spark Connector is a standalone connector. It uses the synchronous Mongo Java Driver, because Spark is designed for CPU-intensive synchronous tasks. It is designed to follow Spark idioms and is all you need to connect MongoDB to Spark.
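
For illustration, a minimal sketch of loading a collection through the connector; the connection URI, database name ("test"), and collection name ("coll") are placeholder assumptions:

import com.mongodb.spark.MongoSpark
import org.apache.spark.{SparkConf, SparkContext}

// Placeholder URI: local MongoDB, database "test", collection "coll"
val conf = new SparkConf()
  .setAppName("Cluster Application")
  .set("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.coll")

val sc = new SparkContext(conf)

// MongoSpark.load returns an RDD of org.bson.Document backed by the collection
val rdd = MongoSpark.load(sc)
println("Document count: " + rdd.count())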

The Mongo Scala Driver, on the other hand, is idiomatic to modern Scala conventions: all IO is fully asynchronous. That is great for web applications and for improving the scalability of an individual machine.
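
For contrast, a rough sketch of that driver's asynchronous style (database and collection names are hypothetical); operations return Observables that can be collected into a Future instead of blocking:

import scala.concurrent.ExecutionContext.Implicits.global
import org.mongodb.scala._

// MongoClient() connects to mongodb://localhost:27017 by default
val client = MongoClient()
val collection: MongoCollection[Document] = client.getDatabase("test").getCollection("coll")

// find() returns an Observable; toFuture() gathers the results without blocking
collection.find().toFuture().foreach { docs =>
  println("Fetched " + docs.size + " documents")
}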
