"Failed to find data source: parquet" when making a fat jar with Maven

Problem description

I am assembling a fat jar with the Maven Assembly Plugin and get the following exception when I run it:

Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: parquet. Please find packages at http://spark-packages.org
    at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:145)
    at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:78)
    at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:78)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:310)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
    at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:427)
    at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:411)
    at org.apache.spark.mllib.classification.impl.GLMClassificationModel$SaveLoadV1_0$.loadData(GLMClassificationModel.scala:77)
    at org.apache.spark.mllib.classification.LogisticRegressionModel$.load(LogisticRegression.scala:183)
    at org.apache.spark.mllib.classification.LogisticRegressionModel.load(LogisticRegression.scala)
    at my.test.spark.assembling.TopicClassifier.load(TopicClassifier.java:35)
    at my.test.spark.assembling.Main.main(Main.java:23)
Caused by: java.lang.ClassNotFoundException: parquet.DefaultSource
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
    at scala.util.Try$.apply(Try.scala:192)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
    at scala.util.Try.orElse(Try.scala:84)
    at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:130)
    ... 11 more

Here is the pom.xml:

<groupId>my.test.spark</groupId>
<artifactId>assembling</artifactId>
<version>1.0-SNAPSHOT</version>

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>


<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.11</artifactId>
        <version>2.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.0.0</version>
    </dependency>
</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>

        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
        </plugin>
    </plugins>
</build>

If I run the application from IntelliJ IDEA, the problem doesn't occur.

What else should I include in the jar so that the class can be found?

Recommended answer

I found a solution to the problem. I tried building the package with sbt assembly instead and ran into a different but related problem. The solution I found for that one (https://stackoverflow.com/a/27532248/5520896) also fixes my original issue.

What solves the problem is switching from the Maven Assembly Plugin to the Maven Shade Plugin and applying the following transformer:

<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
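As a rough illustration of why this transformer matters: Spark 2.x looks up data sources such as parquet through java.util.ServiceLoader, which reads META-INF/services/org.apache.spark.sql.sources.DataSourceRegister from the classpath. More than one Spark jar ships a file with that name, so a fat jar that keeps only one copy can lose the parquet registration. Spark then falls back to loading a class named parquet.DefaultSource, which matches the "Caused by" line in the trace. The sketch below is not the shade plugin's code. The class name ServiceFileMerge is hypothetical, and the provider lists are abbreviated samples, assumed here for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class: contrasts "one copy wins" with merging service files.
public class ServiceFileMerge {

    // Keeps only the last file seen, dropping every provider listed in the others.
    public static List<String> lastWins(List<List<String>> files) {
        if (files.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(files.get(files.size() - 1));
    }

    // Concatenates the provider lines of all files, skipping blanks,
    // comments and duplicates while preserving the original order.
    public static List<String> merge(List<List<String>> files) {
        Set<String> providers = new LinkedHashSet<>();
        for (List<String> file : files) {
            for (String line : file) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                    providers.add(trimmed);
                }
            }
        }
        return new ArrayList<>(providers);
    }

    public static void main(String[] args) {
        // Abbreviated sample contents of the same service file in two jars (assumed).
        List<String> fromSqlJar = Arrays.asList(
                "org.apache.spark.sql.execution.datasources.json.JsonFileFormat",
                "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat");
        List<String> fromMllibJar = Arrays.asList(
                "org.apache.spark.ml.source.libsvm.LibSVMFileFormat");
        List<List<String>> files = Arrays.asList(fromSqlJar, fromMllibJar);

        System.out.println("one copy kept: " + lastWins(files));
        System.out.println("merged:        " + merge(files));
    }
}
```

In the first case the parquet provider is no longer listed, while in the merged case all three providers survive. Merging same-named service files is what ServicesResourceTransformer is meant to do.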

My final pom.xml plugin configuration is the following:

        <plugin>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.4.1</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <createDependencyReducedPom>false</createDependencyReducedPom>

                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                        <transformers>                                
                            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>

What apparently goes wrong with the Maven Assembly Plugin is explained here: https://stackoverflow.com/a/21118824/5520896
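One way to confirm that diagnosis on a built jar is to print the data source registrations it actually contains. The following is a minimal sketch, assuming the fat jar path is passed as the only argument. The class name ServiceFileCheck and the wording of the output are hypothetical. The entry name is the service file Spark 2.x reads for DataSourceRegister.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: lists the data source providers registered in a jar.
public class ServiceFileCheck {

    public static final String SERVICE_ENTRY =
            "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";

    // Returns the non-blank, non-comment lines of the given jar entry,
    // or an empty list if the entry does not exist.
    public static List<String> readEntryLines(String jarPath, String entryName)
            throws IOException {
        List<String> lines = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                return lines;
            }
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String trimmed = line.trim();
                    if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                        lines.add(trimmed);
                    }
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ServiceFileCheck <path-to-fat-jar>");
            return;
        }
        List<String> providers = readEntryLines(args[0], SERVICE_ENTRY);
        providers.forEach(System.out::println);
        // Simple heuristic: look for a provider whose class name mentions parquet.
        boolean hasParquet = providers.stream()
                .anyMatch(p -> p.toLowerCase().contains("parquet"));
        System.out.println(hasParquet
                ? "parquet provider is registered"
                : "parquet provider is MISSING");
    }
}
```

Running it against the jar produced by each plugin should show whether the parquet entry survived packaging.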
