How to pass a configuration file hosted in HDFS to a Spark application?


Question

I'm working with Spark Structured Streaming in Scala, and I want to pass a configuration file to my Spark application. This configuration file is hosted in HDFS. For example:

spark_job.conf (HOCON):

spark {
  appName: "",
  master: "",
  shuffle.size: 4 
  etc..
}

kafkaSource {
  servers: "",
  topic: "",
  etc..
}

redisSink {
  host: "",
  port: 999,
  timeout: 2000,
  checkpointLocation: "hdfs location",
  etc..
}

How can I pass it to the Spark application? How can I read this file (hosted in HDFS) in Spark?

Answer

You can read the HOCON config from HDFS in the following way:

import com.typesafe.config.{Config, ConfigFactory}
import java.io.InputStreamReader
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.conf.Configuration

// Get a handle to HDFS; "hdfs://" resolves the default namenode from the
// Hadoop configuration on the classpath
val hdfs: FileSystem = FileSystem.get(new URI("hdfs://"), new Configuration())

// Open the file as a character stream and hand it to Typesafe Config's parser
val reader = new InputStreamReader(hdfs.open(new Path("/path/to/conf/on/hdfs")))
val conf: Config = ConfigFactory.parseReader(reader)
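
From there you can pull values out of the parsed Config and apply them when building the session and sources. A minimal sketch, assuming the key names from the spark_job.conf example above (mapping shuffle.size to spark.sql.shuffle.partitions is an illustrative guess, not something the config prescribes):

import org.apache.spark.sql.SparkSession

// Build the session from the "spark" section of the config
val spark = SparkSession.builder()
  .appName(conf.getString("spark.appName"))
  .master(conf.getString("spark.master"))
  .config("spark.sql.shuffle.partitions", conf.getLong("spark.shuffle.size"))
  .getOrCreate()

// Wire the Kafka source from the "kafkaSource" section
val kafkaDf = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", conf.getString("kafkaSource.servers"))
  .option("subscribe", conf.getString("kafkaSource.topic"))
  .load()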

You can also pass the URI of your namenode to FileSystem.get(new URI("your_uri_here")), and the code will still read your configuration.
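
As for getting the file's location into the application in the first place: a common pattern (not part of the original answer) is to pass the HDFS path as an application argument on spark-submit and read it in main. A sketch, where the object name and submit command are hypothetical:

// Submitted e.g. as:
//   spark-submit --class StreamingJob app.jar hdfs://namenode:8020/conf/spark_job.conf
import com.typesafe.config.{Config, ConfigFactory}
import java.io.InputStreamReader
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object StreamingJob {
  def main(args: Array[String]): Unit = {
    val configPath = args(0) // HDFS URI supplied at submit time
    // FileSystem.get picks the namenode out of the full URI
    val hdfs = FileSystem.get(new URI(configPath), new Configuration())
    val conf: Config = ConfigFactory.parseReader(
      new InputStreamReader(hdfs.open(new Path(configPath))))
    // ... build the SparkSession and sources from `conf` as shown above
  }
}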
