spark + sbt-assembly:“去重:在下面找到不同的文件内容" [英] spark + sbt-assembly: "deduplicate: different file contents found in the following"

查看:19
本文介绍了spark + sbt-assembly:“去重:在下面找到不同的文件内容"的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我运行了 spark 应用程序并想将测试类打包到 fat jar 中.奇怪的是我成功地运行了sbt assembly",但是当我运行sbt test:assembly"时却失败了.

I ran spark application and wanna pack the test classes into the fat jar. What is weird is I ran "sbt assembly" successfully, but failed when I ran "sbt test:assembly".

我尝试了 sbt-assembly : 包括测试类,但没有成功对于我的情况.

I tried sbt-assembly : including test classes, it didn't work for my case.

SBT 版本:0.13.8

SBT version : 0.13.8

build.sbt:

import sbtassembly.AssemblyPlugin._

name := "assembly-test"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  ("org.apache.spark" % "spark-core_2.10" % "1.3.1" % Provided)
    .exclude("org.mortbay.jetty", "servlet-api").
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("commons-logging", "commons-logging").
    exclude("com.esotericsoftware.minlog", "minlog").exclude("com.codahale.metrics", "metrics-core"),
  "org.json4s" % "json4s-jackson_2.10" % "3.2.10" % Provided,
  "com.google.inject" % "guice" % "4.0"
)

Project.inConfig(Test)(assemblySettings)

推荐答案

您必须在程序集中定义 mergeStratey,就像我在下面为我的 spark 应用程序所做的那样.

You will have to define mergeStratey in assembly, as what I did for my spark app below.

mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
    case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
    case PathList("org", "apache", xs @ _*) => MergeStrategy.last
    case PathList("com", "google", xs @ _*) => MergeStrategy.last
    case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
    case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
    case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
    case "about.html" => MergeStrategy.rename
    case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
    case "META-INF/mailcap" => MergeStrategy.last
    case "META-INF/mimetypes.default" => MergeStrategy.last
    case "plugin.properties" => MergeStrategy.last
    case "log4j.properties" => MergeStrategy.last
    case x => old(x)
  }
}

这篇关于spark + sbt-assembly:“去重:在下面找到不同的文件内容"的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆