Spark: additional properties in a directory
Question
I am working with Spark 1.5.0 on Amazon's EMR. I have multiple properties files that I need to use in my spark-submit program. I explored the --properties-file option, but it only allows you to import properties from a single file. I need to read properties from a directory whose structure looks like:
├── AddToCollection
│ ├── query
│ ├── root
│ ├── schema
│ └── schema.json
├── CreateCollectionSuccess
│ ├── query
│ ├── root
│ ├── schema
│ └── schema.json
├── FeedCardUnlike
│ ├── query
│ ├── root
│ ├── schema
│ └── schema.json
In standalone mode I can get away with this by specifying the location of the files on the local system, but it doesn't work in cluster mode, where I'm using a jar with the spark-submit command. How can I do this in Spark?
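Since --files (discussed in the answer below) takes a comma-separated list, one way to cover a whole directory is to build that list at submit time. A minimal sketch, assuming a hypothetical events/ layout that mirrors the tree above:

```shell
# Hypothetical layout mirroring the directory tree in the question
mkdir -p events/AddToCollection events/CreateCollectionSuccess
echo '{}' > events/AddToCollection/schema.json
echo '{}' > events/CreateCollectionSuccess/schema.json

# Join every file under the directory into one comma-separated list
FILES=$(find events -type f | sort | paste -sd, -)
echo "$FILES"
# events/AddToCollection/schema.json,events/CreateCollectionSuccess/schema.json
```

The resulting list can then be passed as --files "$FILES" to spark-submit. Note that YARN flattens the shipped files into the container's working directory, so duplicate basenames across subdirectories (e.g. several schema.json files) would collide and need renaming first.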
Answer
This works on Spark 1.6.1 (I haven't tested earlier versions).
spark-submit supports the --files argument, which accepts a comma-separated list of "local" files to be submitted along with your JAR file to the driver.
spark-submit \
--class com.acme.Main \
--master yarn \
--deploy-mode cluster \
--driver-memory 2g \
--executor-memory 1g \
--driver-class-path "./conf" \
--files "./conf/app.properties,./conf/log4j.properties" \
./lib/my-app-uber.jar \
"$@"
In this example I created an uber JAR that does not contain any properties files. When I deploy my application, the app.properties and log4j.properties files are placed in the local ./conf directory.
From the SparkSubmitArguments declaration:

--files FILES
Comma-separated list of files to be placed in the working directory of each executor.
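Because the files land in each container's working directory rather than under their submit-time paths, application code should open them by bare filename. A sketch simulating that localization step (the workdir/ name and env=prod content are made up for illustration):

```shell
# Simulate YARN localizing a --files entry into a container's working dir
mkdir -p workdir
printf 'env=prod\n' > workdir/app.properties

# Inside the container, you read the bare name, not ./conf/app.properties
cd workdir
ENV_VALUE=$(grep '^env=' app.properties | cut -d= -f2)
echo "$ENV_VALUE"
# prod
```

The same reasoning explains the --driver-class-path "./conf" flag above: it lets classpath-based loaders find the properties files relative to where YARN placed them.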