Hadoop Namenode Metadata - fsimage and edit logs

Question

I understand that the fsimage is loaded into memory at startup, and that any further transactions are appended to the edit log rather than to the fsimage, for performance reasons.

The in-memory fsimage is refreshed when the NameNode restarts. For efficiency, the Secondary NameNode periodically performs a checkpoint to update the fsimage so that NameNode recovery is faster. All of this is fine.

What I fail to understand is this. Say a file already exists and its metadata is in the in-memory fsimage. I then move the file to a different location, and the move is recorded in the edit log. When I now try to list the old path, it complains that the file does not exist.

Does this mean that the NameNode also looks at the edit log, which would contradict the purpose of keeping the fsimage in memory? If not, how does it know that the file's location has changed?

Recommended answer

The NameNode knows because of the information recorded in the edit log: every change is journaled there and applied to the in-memory namespace at the same time, so lookups never need to go back to the fsimage file on disk. The same holds when a new file is written to HDFS. You can check this on a running NameNode: remove the fsimage file, then read a file from HDFS, and the read still succeeds.

Removing the fsimage file from a running NameNode does not affect read/write operations. On the next restart, however, the NameNode reports errors stating that the image file cannot be found.
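
To make the behaviour above concrete, here is a minimal toy model in Python. It is not Hadoop code: the class ToyNameNode and its methods are hypothetical names invented for illustration, and it assumes only what is described above, namely that the checkpoint is read once at startup and that later lookups are answered from memory.

```python
class ToyNameNode:
    """Toy model only; not Hadoop code. All names here are hypothetical."""

    def __init__(self, fsimage):
        # Startup: the checkpoint on disk is copied into memory once.
        self.namespace = dict(fsimage)  # path -> metadata
        self.edit_log = []              # changes made since the checkpoint

    def rename(self, src, dst):
        # A mutation is journaled and applied to the in-memory namespace.
        self.edit_log.append(("rename", src, dst))
        self.namespace[dst] = self.namespace.pop(src)

    def exists(self, path):
        # Lookups consult memory only; the fsimage is not read here.
        return path in self.namespace


fsimage_on_disk = {"/data/a.txt": {"blocks": 1}}
nn = ToyNameNode(fsimage_on_disk)
nn.rename("/data/a.txt", "/archive/a.txt")

old_visible = nn.exists("/data/a.txt")
new_visible = nn.exists("/archive/a.txt")
checkpoint_unchanged = "/data/a.txt" in fsimage_on_disk
print(old_visible, new_visible, checkpoint_unchanged)  # False True True
```

In this sketch the old path disappears immediately even though the checkpoint still lists it, which mirrors why listing the old path fails right after the move.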

Let me try to explain in a bit more detail.

Hadoop looks for the fsimage file only at startup. If the file is missing, the NameNode does not come up and logs a message about formatting the NameNode.

The hadoop namenode -format command creates the fsimage file (if edit logs are present). At startup, the NameNode loads the fsimage and replays the edit log on top of it; after that, file metadata is served from memory, and later changes go to the edit log. So the fsimage only works as a checkpoint of the state that was last saved. This is also one reason the Secondary NameNode keeps merging the edit log into the fsimage (by default every 1 hour or 1 million transactions): on startup, little remains to be replayed on top of the last checkpoint.
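
The checkpoint idea can be sketched as a small replay function. This is an illustrative Python sketch, not the Secondary NameNode implementation: apply_edits and the tuple-based edit format are assumptions made up for this example.

```python
def apply_edits(image, edits):
    """Return a new namespace: the checkpoint plus the replayed edits."""
    namespace = dict(image)
    for op, *args in edits:
        if op == "create":
            (path,) = args
            namespace[path] = {}
        elif op == "rename":
            src, dst = args
            namespace[dst] = namespace.pop(src)
        elif op == "delete":
            (path,) = args
            namespace.pop(path, None)
    return namespace


fsimage = {"/data/a.txt": {}}
edits = [("rename", "/data/a.txt", "/archive/a.txt"), ("create", "/data/b.txt")]

# Restart without a checkpoint: load the image, then replay every edit.
after_restart = apply_edits(fsimage, edits)

# Checkpoint: fold the edits into a new image so the log can start empty.
new_fsimage, remaining_edits = apply_edits(fsimage, edits), []
after_checkpointed_restart = apply_edits(new_fsimage, remaining_edits)

print(sorted(after_restart))  # ['/archive/a.txt', '/data/b.txt']
print(after_restart == after_checkpointed_restart)  # True
```

Both paths reach the same namespace; the checkpointed one simply has nothing left to replay, which is the startup saving the answer refers to.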

If you turn on safe mode (command: hdfs dfsadmin -safemode enter) and then run saveNamespace (command: hdfs dfsadmin -saveNamespace), the NameNode logs the following messages.

2014-07-05 15:03:13,195 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Saving image file /data/hadoop-namenode-data-temp/current/fsimage.ckpt_0000000000000000169 using no compression
2014-07-05 15:03:13,205 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Image file /data/hadoop-namenode-data-temp/current/fsimage.ckpt_0000000000000000169 of size 288 bytes saved in 0 seconds.
2014-07-05 15:03:13,213 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to retain 2 images with txid >= 0
2014-07-05 15:03:13,237 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 170
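
As a small aid for reading these messages, the following Python sketch extracts the transaction IDs from two of the log lines above. The regular expressions are assumptions based only on the lines shown here, not on a documented log format. Read this way, the saved image is labelled with transaction 169 and the new edit log segment starts at 170.

```python
import re

# Two of the log lines quoted above, copied verbatim.
log_lines = [
    "2014-07-05 15:03:13,195 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: "
    "Saving image file /data/hadoop-namenode-data-temp/current/"
    "fsimage.ckpt_0000000000000000169 using no compression",
    "2014-07-05 15:03:13,237 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: "
    "Starting log segment at 170",
]

checkpoint_txid = None
next_segment_txid = None
for line in log_lines:
    image_match = re.search(r"fsimage\.ckpt_(\d+)", line)
    segment_match = re.search(r"Starting log segment at (\d+)", line)
    if image_match:
        checkpoint_txid = int(image_match.group(1))
    if segment_match:
        next_segment_txid = int(segment_match.group(1))

print(checkpoint_txid, next_segment_txid)  # 169 170
```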
