HDFS space consumed: "hdfs dfs -du /" vs "hdfs dfsadmin -report"
Problem description
Which tool is the right one to measure the HDFS space consumed?
When I sum up the output of "hdfs dfs -du /", I always get less space consumed than "hdfs dfsadmin -report" shows (the "DFS Used" line). Is there data that du does not take into account?
Recommended answer
The Hadoop file system provides reliable storage by placing copies of the data on several nodes. The number of copies is the replication factor, which is usually greater than one.
The command hdfs dfs -du / shows the space your data consumes without replication.
The command hdfs dfsadmin -report (the "DFS Used" line) shows the actual disk usage, taking data replication into account, so it should be several times bigger than the number you get from hdfs dfs -du.
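The relationship described above can be sketched numerically. A minimal check, assuming a uniform replication factor of 3 (the Hadoop default, though real clusters can set different factors per file) and made-up sizes:

```shell
# Sum of the sizes printed by "hdfs dfs -du /" (made-up example, in bytes)
du_total=4260000000
# Assumed cluster-wide replication factor; check yours with
# "hdfs getconf -confKey dfs.replication"
replication=3
# Rough approximation of the "DFS Used" line of "hdfs dfsadmin -report"
dfs_used=$((du_total * replication))
echo "$dfs_used"   # 12780000000
```

In practice "DFS Used" also includes some non-block overhead, so expect the figures to be close rather than identical.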