Where can I find the jars folder in Spark 1.6?


Question

From the Spark downloads page, if I download the tar file for v2.0.1, I see that it contains some jars that I find useful to include in my app.

If I download the tar file for v1.6.2 instead, I don't find the jars folder in there. Is there an alternate package type I should use from that site? I am currently choosing the default (pre-built for Hadoop 2.6). Alternatively, where can I find those Spark jars - should I get each of them individually from http://spark-packages.org?

Here is an indicative list of the jars I want to use:

  • hadoop-common
  • spark-core
  • spark-csv
  • spark-sql
  • univocity-parsers
  • spark-catalyst
  • json4s-core

Answer

The way Spark ships its run-time has changed from V1 to V2.

  • In V2, by default, you have multiple JARs under $SPARK_HOME/jars
  • In V1, by default, there was just one massive spark-assembly*.jar under $SPARK_HOME/lib containing all the dependencies (see the sketch below)
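
A minimal sketch of the difference, assuming $SPARK_HOME points at an unpacked pre-built distribution of each version (the file names in the comments are illustrative):

  # Spark 2.x: many individual JARs under jars/
  ls "$SPARK_HOME/jars" | grep spark-sql
  # e.g. spark-sql_2.11-2.0.1.jar

  # Spark 1.6.x: no jars/ folder; one big assembly JAR under lib/ instead
  ls "$SPARK_HOME/lib" | grep spark-assembly
  # e.g. spark-assembly-1.6.2-hadoop2.6.0.jar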

I believe you can change the default behavior, but that would require recompiling Spark on your own...

And more specifically about spark-csv:

  • In V2, the CSV file format is natively supported by SparkSQL
  • In V1, you have to download spark-csv (for Scala 2.10) from Spark-Packages.org, plus commons-csv from Commons.Apache.org, and add both JARs to your CLASSPATH
    (with --jars on the command line, or with prop spark.driver.extraClassPath plus instruction sc.addJar() if the command line does not work for some reason)
    ...and the syntax is more cumbersome, too (see the sketch below)
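
A minimal sketch of both approaches; the local paths and versions below are only illustrative (spark-csv 1.5.0 for Scala 2.10 and commons-csv 1.1 were a common pairing):

  # Spark 1.6.x: put the downloaded JARs on the classpath yourself
  spark-shell --jars /path/to/spark-csv_2.10-1.5.0.jar,/path/to/commons-csv-1.1.jar
  # then, inside the shell (Scala):
  #   sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("data.csv")

  # Spark 2.x: nothing extra to download, CSV support is built into SparkSQL
  spark-shell
  #   spark.read.option("header", "true").csv("data.csv")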


Excerpt from the vanilla $SPARK_HOME/bin/spark-class as of Spark 2.1.x (greatly simplified):

  # Find Spark jars
  SPARK_JARS_DIR="${SPARK_HOME}/jars"
  LAUNCH_CLASSPATH="$SPARK_JARS_DIR/*"

And as of Spark 1.6.x:

  # Find assembly jar
  ASSEMBLY_DIR="${SPARK_HOME}/lib"
  ASSEMBLY_JARS="$(ls -1 "$ASSEMBLY_DIR" | grep "^spark-assembly.*hadoop.*\.jar$" || true)"
  SPARK_ASSEMBLY_JAR="${ASSEMBLY_DIR}/${ASSEMBLY_JARS}"
  LAUNCH_CLASSPATH="$SPARK_ASSEMBLY_JAR"

