How can I change the HDFS replication factor for my Spark program?

Question
I need to change the HDFS replication factor from 3 to 1 for my Spark program. While searching, I came across the "spark.hadoop.dfs.replication" property, but looking at https://spark.apache.org/docs/latest/configuration.html, it doesn't seem to exist anymore. So, how can I change the HDFS replication factor from my Spark program or using spark-submit?
Answer
HDFS configuration is not specific to Spark in any way. You should be able to modify it using the standard Hadoop configuration files, in particular hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
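Since any Spark property prefixed with spark.hadoop. is copied into the Hadoop Configuration the job uses, the replication factor can also be passed at submit time, which covers the spark-submit half of the question. The class and JAR names below are placeholders:

```shell
# Properties prefixed with spark.hadoop. are injected into the job's
# Hadoop Configuration; the application class and JAR are placeholders.
spark-submit \
  --conf spark.hadoop.dfs.replication=1 \
  --class com.example.MyApp \
  my-app.jar
```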
It is also possible to access the Hadoop configuration through the SparkContext instance:
val hconf: org.apache.hadoop.conf.Configuration = spark.sparkContext.hadoopConfiguration
hconf.setInt("dfs.replication", 3)  // affects files written after this call
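Setting dfs.replication this way only applies to files written afterwards; it does not change data already on HDFS. For existing files, a sketch using the Hadoop FileSystem API, assuming a live SparkSession named spark and with a placeholder path:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Change the replication factor of a file that already exists on HDFS.
// The path is a placeholder; `spark` is assumed to be an active SparkSession.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
fs.setReplication(new Path("/user/me/output/part-00000"), 1.toShort)
```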