Where can I find the jars folder in Spark 1.6?
Question
From the Spark downloads page, if I download the tar file for v2.0.1, I see that it contains some jars that I find useful to include in my app.
If I download the tar file for v1.6.2 instead, I don't find the jars folder in there. Is there an alternate package type I should use from that site? I am currently choosing the default (pre-built for Hadoop 2.6). Alternately, where can I find those Spark jars - should I get each of them individually from http://spark-packages.org?
Here is an indicative bunch of jars I want to use:
- hadoop-common
- spark-core
- spark-csv
- spark-sql
- univocity-parsers
- spark-catalyst
- json4s-core
Answer
The way Spark ships its runtime has changed from V1 to V2.
- In V2, by default, you have multiple JARs under $SPARK_HOME/jars
- In V1, by default, there was just one massive spark-assembly*.jar under $SPARK_HOME/lib that contains all the dependencies (see the directory listing sketch below).
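A quick way to see that difference on disk (a minimal sketch; the exact file names depend on the build you downloaded):

# V2 layout: many individual JARs
ls "$SPARK_HOME/jars"
# e.g. spark-core_2.11-2.0.1.jar, spark-sql_2.11-2.0.1.jar, spark-catalyst_2.11-2.0.1.jar, ...

# V1 layout: one fat assembly JAR
ls "$SPARK_HOME/lib"
# e.g. spark-assembly-1.6.2-hadoop2.6.0.jar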
I believe you can change the default behavior, but that would require recompiling Spark on your own...
And specifically about spark-csv:
- In V2, the CSV file format is natively supported by SparkSQL
- In V1, you have to download spark-csv (for Scala 2.10) from Spark-Packages.org plus commons-csv from Commons.Apache.org and add both JARs to your CLASSPATH (with --jars on the command line, or with prop spark.driver.extraClassPath + instruction sc.addJar() if the command line does not work for some reason) ...and the syntax is more cumbersome, too (see the sketch after this list)
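To make that concrete, here is a hedged sketch of both workflows (JAR versions and file names are illustrative, not from the original answer):

# V1: supply both JARs yourself on the command line
spark-shell --jars spark-csv_2.10-1.5.0.jar,commons-csv-1.1.jar
# then, inside the shell (Scala):
#   val df = sqlContext.read.format("com.databricks.spark.csv")
#                      .option("header", "true").load("data.csv")

# V2: CSV support is built in - no extra JARs, shorter syntax
spark-shell
#   val df = spark.read.option("header", "true").csv("data.csv")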
Excerpt from the vanilla $SPARK_HOME/bin/spark-class as of Spark 2.1.x (greatly simplified):
# Find Spark jars
SPARK_JARS_DIR="${SPARK_HOME}/jars"
LAUNCH_CLASSPATH="$SPARK_JARS_DIR/*"
And as of Spark 1.6.x:
# Find assembly jar
ASSEMBLY_DIR="${SPARK_HOME}/lib"
ASSEMBLY_JARS="$(ls -1 "$ASSEMBLY_DIR" | grep "^spark-assembly.*hadoop.*\.jar$" || true)"
SPARK_ASSEMBLY_JAR="${ASSEMBLY_DIR}/${ASSEMBLY_JARS}"
LAUNCH_CLASSPATH="$SPARK_ASSEMBLY_JAR"
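So in 1.6.x, jars such as spark-core, spark-sql, spark-catalyst, hadoop-common and json4s-core are not shipped as separate files; their classes live inside that single assembly. A hedged way to verify (the jar tool requires a JDK; unzip -l works too):

# V1: confirm a dependency's classes are bundled in the assembly
jar tf "$SPARK_HOME"/lib/spark-assembly-*.jar | grep "spark/sql/catalyst" | head
# note: third-party packages like spark-csv and univocity-parsers are NOT
# in the assembly and still have to be fetched separately (e.g. via --jars)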