How can I change HDFS replication factor for my Spark program?


Problem description


I need to change the HDFS replication factor from 3 to 1 for my Spark program. While searching, I came across the "spark.hadoop.dfs.replication" property, but looking at https://spark.apache.org/docs/latest/configuration.html, it doesn't seem to exist anymore. So, how can I change the HDFS replication factor from my Spark program, or using spark-submit?

Recommended answer


HDFS configuration is not specific to Spark in any way. You should be able to modify it using the standard Hadoop configuration files, in particular hdfs-site.xml:

<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
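Note that dfs.replication set this way only becomes the default for files written from then on; files that already exist on HDFS keep their current factor. If existing output also has to be changed, the Hadoop FileSystem API provides setReplication. A minimal sketch (the path below is a placeholder, not from the original question):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// connect to HDFS using the configuration found on the classpath
val fs = FileSystem.get(new Configuration())
// lower the replication factor of an already-written file to 1
fs.setReplication(new Path("/user/me/output/part-00000"), 1.toShort)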


It is also possible to access the Hadoop configuration through the SparkContext instance:

// grab the Hadoop configuration backing the active Spark session
val hconf: org.apache.hadoop.conf.Configuration = spark.sparkContext.hadoopConfiguration
// set dfs.replication for files written through this session
hconf.setInt("dfs.replication", 3)
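As for the spark-submit part of the question: Spark forwards any option prefixed with spark.hadoop. into the Hadoop Configuration it creates, so the spark.hadoop.dfs.replication property mentioned in the question should still take effect even though it no longer appears as an individual entry on the configuration page. A minimal example (the JAR and class names are placeholders):

spark-submit \
  --conf spark.hadoop.dfs.replication=1 \
  --class com.example.MyApp \
  my-app.jar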
