Namenode file quantity limit


Problem description

Does anyone know how many bytes each file occupies in the namenode of HDFS? I want to estimate how many files can be stored on a single namenode with 32 GB of memory.

Solution

Each file, directory, or block occupies about 150 bytes in the namenode's memory. [1] So a cluster whose namenode has 32 GB of RAM can support a maximum of about 38 million files, assuming the namenode is the bottleneck. (Each file also takes up a block, so each file effectively takes about 300 bytes. Assuming 3x replication as well, each file takes up about 900 bytes.)

In practice, however, the number will be much smaller, because not all of the 32 GB is available to the namenode for keeping the mapping. You can increase the limit by allocating more heap space to the namenode on that machine.
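
As a quick sanity check of the 38-million figure, here is a minimal Python sketch of the arithmetic. The 150-byte-per-object cost, the 3x replication factor, and reading "32G" as 32 GiB of usable heap are all assumptions carried over from the answer, not measured values:

```python
# Back-of-the-envelope namenode capacity estimate, based on the
# ~150-byte per-object figure from [1]. All values are approximate.

BYTES_PER_OBJECT = 150      # one file, directory, or block entry
REPLICATION = 3             # assumed replication factor
HEAP_BYTES = 32 * 1024**3   # reading "32G" as 32 GiB of heap

# One file plus its single block costs ~300 bytes; the answer above
# conservatively multiplies that whole cost by the replication factor.
bytes_per_file = 2 * BYTES_PER_OBJECT * REPLICATION   # ~900 bytes

max_files = HEAP_BYTES // bytes_per_file
print(f"~{max_files / 1e6:.0f} million files")        # ~38 million
```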



Replication also affects this, though to a lesser degree: each additional replica adds about 16 bytes to the memory requirement. [2]
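
Taken at face value, that 16-byte-per-replica figure implies a much lower per-file cost than the conservative 900-byte estimate above. Here is a sketch of the refined estimate under the same assumed constants:

```python
# Refined per-file cost: one file entry plus one block entry at ~150
# bytes each, plus ~16 bytes for each replica beyond the first (the
# per-replica overhead cited in [2]). All values are approximate.

BYTES_PER_OBJECT = 150
BYTES_PER_EXTRA_REPLICA = 16
REPLICATION = 3
HEAP_BYTES = 32 * 1024**3   # 32 GiB of heap

bytes_per_file = (2 * BYTES_PER_OBJECT
                  + (REPLICATION - 1) * BYTES_PER_EXTRA_REPLICA)  # ~332 bytes

max_files = HEAP_BYTES // bytes_per_file
print(f"~{max_files / 1e6:.0f} million files")  # ~103 million
```

Either way, these are upper bounds: as noted above, only part of the heap is actually available for holding the namespace.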

[1] http://www.cloudera.com/blog/2009/02/the-small-files-problem/

[2] http://search-hadoop.com/c/HDFS:/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java%7C%7CBlockInfo



