How does HDFS choose a datanode to store a file?
Question
As the title indicates, when a client requests to write a file to HDFS, how does HDFS or the name node choose which datanode stores the file? If the file is too big, does HDFS try to store all of its blocks on the same node, or on nodes in the same rack? Does HDFS provide any API that lets an application store a file on a datanode of its choosing?
The code for choosing a datanode is in the function ReplicationTargetChooser.chooseTarget().
The comment on it says:
The replica placement strategy is that if the writer is on a datanode, the 1st replica is placed on the local machine, otherwise a random datanode. The 2nd replica is placed on a datanode that is on a different rack. The 3rd replica is placed on a datanode which is on the same rack as the first replica.
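The three rules in that comment can be sketched as a small toy model. This is not Hadoop's code: the function name choose_targets, the topology dictionary (datanode name to rack name), and the single-rack fallback are assumptions made only for illustration, and the real chooser also weighs factors such as node load and free space that are ignored here.

```python
import random


def choose_targets(writer, topology, replicas=3, rng=random):
    """Toy model of the replica placement comment (not Hadoop's implementation).

    topology: dict mapping a hypothetical datanode name to its rack name.
    writer:   host name of the client; it may or may not be a datanode.
    """
    nodes = sorted(topology)

    # 1st replica: the writer's own machine if it is a datanode, else a random one.
    first = writer if writer in topology else rng.choice(nodes)
    targets = [first]

    if replicas >= 2:
        # 2nd replica: a datanode on a different rack from the first replica.
        remote = [n for n in nodes if topology[n] != topology[first]]
        # Assumed fallback for a single-rack cluster: any unused datanode.
        pool = remote or [n for n in nodes if n not in targets]
        if pool:
            targets.append(rng.choice(pool))

    if replicas >= 3:
        # 3rd replica: a different datanode on the same rack as the first replica.
        local = [n for n in nodes
                 if topology[n] == topology[first] and n not in targets]
        # Assumed fallback when the first rack has no other node.
        pool = local or [n for n in nodes if n not in targets]
        if pool:
            targets.append(rng.choice(pool))

    return targets


if __name__ == "__main__":
    # Hypothetical four-node, two-rack cluster.
    cluster = {"dn1": "rackA", "dn2": "rackA", "dn3": "rackB", "dn4": "rackB"}
    print(choose_targets("dn1", cluster))
```

Note that the choice is made per block, so in this model the blocks of a large file written from a non-datanode client would generally start on different random nodes rather than all landing on one machine.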
It doesn't provide any API for applications to store a file on the datanode they want.