How to use properties in a Spark Scala Maven project

Question

I want to keep my settings in a separate properties file and read that file from my Spark code, instead of hardcoding all the credentials directly in the code. I am trying the following approach, but it does not work: AppContext cannot be resolved. Please guide me on how to achieve this.

CASSANDRA_HOST1=127.0.0.133
CASSANDRA_PORT1=9042
CASSANDRA_USER1=usr1
CASSANDRA_PASS1=pas2



DataMigration.cassandra.keyspace1=demo2
DataMigration.cassandra.table1= data1

CASSANDRA_HOST2= 
CASSANDRA_PORT2=9042
CASSANDRA_USER2=usr2
CASSANDRA_PASS2=pas2

D.cassandra.keyspace2=kesp2
D.cassandra.table2= data2

DataMigration.DifferencedRecords.output.path1=C:/spark_windows_proj/File1.csv
DataMigration.DifferencedRecords.output.path2=C:/spark_windows_proj/File1.parquet

----------------------------------------------------------------------------------
DM.scala

import org.apache.spark.sql.SparkSession
import org.apache.hadoop.mapreduce.v2.app.AppContext

object Data_Migration {
  def main(args: Array[String]) {



    val host1: String = AppContext.getProperties().getProperty("CASSANDRA_HOST1")
    val port1 = AppContext.getProperties().getProperty("CASSANDRA_PORT1").toInt
    val keySpace1: String = AppContext.getProperties().getProperty("DataMigration.cassandra.keyspace1")
    val DataMigrationTableName1: String = AppContext.getProperties().getProperty("DataMigration.cassandra.table1")
    val username1: String = AppContext.getProperties().getProperty("CASSANDRA_USER1")
    val pass1: String = AppContext.getProperties().getProperty("CASSANDRA_PASS1")

    val host2: String = AppContext.getProperties().getProperty("CASSANDRA_HOST2")
    val port2 = AppContext.getProperties().getProperty("CASSANDRA_PORT2").toInt
    val keySpace2: String = AppContext.getProperties().getProperty("DataMigration.cassandra.keyspace2")
    val DataMigrationTableName2: String = AppContext.getProperties().getProperty("DataMigration.cassandra.table2")
    val username2: String = AppContext.getProperties().getProperty("CASSANDRA_USER2")
    val pass2: String = AppContext.getProperties().getProperty("CASSANDRA_PASS2")




    val Result_csv: String = AppContext.getProperties().getProperty("DataMigration.DifferencedRecords.output.path1")
    val Result_parquet: String = AppContext.getProperties().getProperty("DataMigration.DifferencedRecords.output.path2")



    val sc = AppContext.getSparkContext()


    val spark = SparkSession
                      .builder() .master("local")
                      .appName("ABC")
                      .config("spark.some.config.option", "some-value")
                      .getOrCreate()



    val df_read1 = spark.read
                       .format("org.apache.spark.sql.cassandra")
                       .option("spark.cassandra.connection.host",host1)
                       .option("spark.cassandra.connection.port",port1)
                       .option( "spark.cassandra.auth.username",username1)
                       .option("spark.cassandra.auth.password",pass1)
                       .option("keyspace",keySpace1)
                       .option("table",DataMigrationTableName1)
                       .load()


Recommended answer

I would rather pass the properties explicitly, by giving the --properties-file option to spark-submit when submitting the job.

AppContext won't necessarily work for all submission types, whereas passing a config file should work everywhere.
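As a rough sketch of the --properties-file route: spark-submit only forwards keys that start with "spark." and ignores the rest, so the keys from the question would need a prefix. The "spark.datamigration." prefix, the key names, and the SubmitProps helper below are all hypothetical names chosen for illustration; the option names on the left of the map are the ones already used in the question. The helper takes a plain lookup function, so it can be tried without a running Spark session.

```scala
// Hypothetical job.properties, passed as:
//   spark-submit --properties-file job.properties ...
//
//   spark.datamigration.cassandra.host1=127.0.0.133
//   spark.datamigration.cassandra.port1=9042
//   spark.datamigration.cassandra.user1=usr1
//   spark.datamigration.cassandra.pass1=pas2
//   spark.datamigration.cassandra.keyspace1=demo2
//   spark.datamigration.cassandra.table1=data1

object SubmitProps {
  // Assumed prefix; any prefix works as long as it starts with "spark."
  val Prefix = "spark.datamigration."

  /** Builds the Cassandra reader options for source number `n` from a key lookup. */
  def cassandraOptions(lookup: String => Option[String], n: Int): Map[String, String] = {
    // Fail early, naming the missing key, instead of passing null to the reader
    def required(name: String): String = {
      val key = Prefix + name + n
      lookup(key).getOrElse(throw new NoSuchElementException("Missing property: " + key))
    }
    Map(
      "spark.cassandra.connection.host" -> required("cassandra.host"),
      "spark.cassandra.connection.port" -> required("cassandra.port"),
      "spark.cassandra.auth.username"   -> required("cassandra.user"),
      "spark.cassandra.auth.password"   -> required("cassandra.pass"),
      "keyspace"                        -> required("cassandra.keyspace"),
      "table"                           -> required("cassandra.table")
    )
  }
}

// Inside the job (needs the Spark and Cassandra connector dependencies):
//   val opts = SubmitProps.cassandraOptions(spark.conf.getOption, 1)
//   val df_read1 = spark.read.format("org.apache.spark.sql.cassandra").options(opts).load()
```

Because the values arrive through the Spark configuration, the same code should work whether the job is submitted locally or to a cluster, which is the point the answer makes about passing a config file.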

Edit: For local usage without spark-submit, you can simply use the standard java.util.Properties class: load the file from the resources and read the properties from it. Put the properties file in src/main/resources rather than src/test/resources, because the latter is on the classpath only for tests. The code is something like:

import java.util.Properties

val props = new Properties()
props.load(getClass.getClassLoader.getResourceAsStream("file.props"))
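A slightly more defensive version of the same idea might look like the sketch below. The AppProps object and its method names are hypothetical, not part of any library. It guards against two things that are easy to hit here: getResourceAsStream returns null when the file is not on the classpath, and getProperty returns null for a missing key (and an empty string for an entry such as "CASSANDRA_HOST2= ").

```scala
import java.io.InputStream
import java.util.Properties

object AppProps {
  /** Loads a properties file from the classpath, e.g. from src/main/resources. */
  def fromClasspath(name: String): Properties = {
    val in = getClass.getClassLoader.getResourceAsStream(name)
    // A missing resource yields null rather than an exception
    require(in != null, s"$name not found on the classpath")
    load(in)
  }

  /** Reads all entries from the stream and always closes it. */
  def load(in: InputStream): Properties = {
    val props = new Properties()
    try props.load(in) finally in.close()
    props
  }

  /** Returns the trimmed value, or fails with the name of the missing or empty key. */
  def required(props: Properties, key: String): String =
    Option(props.getProperty(key)).map(_.trim).filter(_.nonEmpty)
      .getOrElse(throw new NoSuchElementException(s"Missing or empty property: $key"))
}

// Possible use in place of the AppContext calls from the question:
//   val props = AppProps.fromClasspath("file.props")
//   val host1 = AppProps.required(props, "CASSANDRA_HOST1")
//   val port1 = AppProps.required(props, "CASSANDRA_PORT1").toInt
```

Failing on the key name makes a typo or a mismatched key in the properties file show up immediately, instead of as a null passed into the Cassandra reader options.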
