How does HDFS choose a datanode to store a file?


Problem description

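As the title indicates, when a client requests to write a file to HDFS, how does HDFS or the name node choose which datanode stores the file?
If the file is very large, does HDFS try to store all of its blocks on the same node, or on nodes in the same rack?
Does HDFS provide any API that lets an application store a file on a particular datanode of its choosing?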

Solution

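The code that chooses the datanode is in the function ReplicationTargetChooser.chooseTarget().

The comment on it says:

The replica placement strategy is that if the writer is on a datanode, the 1st replica is placed on the local machine, otherwise on a random datanode. The 2nd replica is placed on a datanode that is on a different rack. The 3rd replica is placed on a datanode which is on the same rack as the first replica.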
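
To make the quoted strategy concrete, here is a minimal sketch of the three placement rules. It is not the Hadoop source: the class ReplicaPlacementSketch, the Node type, and the method chooseTargets are hypothetical names made up for illustration, and the sketch ignores everything the real chooser would also have to consider (free space, node load, failed nodes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative only: a toy model of the placement rules quoted above,
// not the actual ReplicationTargetChooser implementation.
public class ReplicaPlacementSketch {

    /** Hypothetical model of a datanode: a name plus the rack it sits on. */
    static final class Node {
        final String name;
        final String rack;

        Node(String name, String rack) {
            this.name = name;
            this.rack = rack;
        }

        @Override
        public String toString() {
            return name + "@" + rack;
        }
    }

    /**
     * Picks up to three targets for a single block.
     * writer may be null when the client does not run on a datanode.
     * Nodes are compared by object identity, so pass the same Node instance
     * that is in the cluster list.
     */
    static List<Node> chooseTargets(Node writer, List<Node> cluster, Random rnd) {
        List<Node> chosen = new ArrayList<>();
        if (cluster.isEmpty()) {
            return chosen;
        }

        // 1st replica: the writer's own machine if it is a datanode, otherwise a random node.
        Node first = (writer != null && cluster.contains(writer))
                ? writer
                : cluster.get(rnd.nextInt(cluster.size()));
        chosen.add(first);

        // 2nd replica: a random node on a different rack from the first replica.
        Node second = pick(cluster, chosen, first.rack, false, rnd);
        if (second != null) {
            chosen.add(second);
        }

        // 3rd replica: a random, not yet chosen node on the same rack as the first replica.
        Node third = pick(cluster, chosen, first.rack, true, rnd);
        if (third != null) {
            chosen.add(third);
        }
        return chosen;
    }

    /** Returns a random unchosen node whose rack matches (or does not match) the given rack, or null. */
    private static Node pick(List<Node> cluster, List<Node> exclude,
                             String rack, boolean sameRack, Random rnd) {
        List<Node> candidates = new ArrayList<>();
        for (Node n : cluster) {
            if (!exclude.contains(n) && n.rack.equals(rack) == sameRack) {
                candidates.add(n);
            }
        }
        return candidates.isEmpty() ? null : candidates.get(rnd.nextInt(candidates.size()));
    }
}
```

In this sketch the choice is made once per call, which mirrors the idea that targets are chosen block by block rather than for a whole file at once; when no suitable node exists for a rule, the sketch simply returns fewer targets.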


HDFS does not provide any API for applications to store a file on a datanode of their choosing.


