What are the pros and cons of using the Hadoop NameNode, Checkpoint Node and Backup Node?


Question




I'm currently evaluating Hadoop 1.0.2 for an in-house project.

The Hadoop docs say that

The Secondary NameNode has been deprecated. Instead, consider using the Checkpoint Node or Backup Node

There is information on what the three options are and what they do, but I'm having trouble finding information on which of the three options is recommended in which situations.

Answer

Basically, the Checkpoint node is a new implementation of the Secondary NameNode, and the Backup node is an interim release on the way to a warm standby for the NameNode. The Backup node can also currently offer a small performance boost by separating reads and writes: reads go to the NameNode and writes go to the Backup node.

From the Backup node documentation, as explained by Konstantin Shvachko:

This patch introduces two new types of name-nodes: a Checkpoint node and a Backup node.

  • The role of the Checkpoint node is to checkpoint name-node meta-data by merging image and edits files.
  • The Backup node extends functionality of the Checkpointer by that it can receive online updates of the file system meta-data, apply them to its memory state and persist them on disks just like the name-node does. Thus at any time the Backup node contains an up-to-date image of the namespace both in memory and on local disk(s). This also results in much more efficient checkpointing because backup node does not need to transfer files from the active name-node and does not need to replay (merge) edits.
  • The Term Standby node is reserved for further extension of the backup node functionality, when cluster will be able to switch over to the new name-node if the active dies. This is mentioned in the "Warm standby provision" section of the design document.
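The difference between the two node types described above can be illustrated with a toy model. This is only a conceptual sketch in Python, not Hadoop's implementation: the `Namespace` class, the `("create", path, length)` / `("delete", path)` edit format, and the function names are all invented for illustration. It shows a Checkpoint-style merge that copies the image and replays the whole edit log, next to a Backup-style node that applies each edit as it arrives and therefore has nothing left to replay at checkpoint time.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Toy namespace image: maps a path to a file length (hypothetical format)."""
    files: dict = field(default_factory=dict)

    def apply(self, edit):
        """Apply one edit-log record to the in-memory state."""
        op, path, *args = edit
        if op == "create":
            self.files[path] = args[0]
        elif op == "delete":
            self.files.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")


def checkpoint(image_files, edits):
    """Checkpoint-node style: copy the image, replay all edits, return the new image."""
    merged = Namespace(dict(image_files))  # stands in for transferring the image
    for edit in edits:                     # stands in for merging the edits file
        merged.apply(edit)
    return merged.files


class BackupNode:
    """Backup-node style: keep an up-to-date copy by applying edits as they stream in."""

    def __init__(self, image_files):
        self.ns = Namespace(dict(image_files))

    def on_edit(self, edit):
        # Each streamed edit is applied immediately to the in-memory state.
        self.ns.apply(edit)

    def checkpoint(self):
        # Nothing to transfer or replay: the current state already is the image.
        return dict(self.ns.files)
```

Feeding the same edits to both paths yields the same image; the Backup-style path simply does the work incrementally instead of in one batch.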

Typical use cases:

  1. Run Checkpoint node only to create checkpoints. This should be used instead of the current SecondaryNameNode, which is deprecated by the patch. I reused a lot of the SecondaryNameNode code so this effort was not wasted, it just evolved.
  2. Run Backup node to support online streaming of edits and efficient checkpointing. This particularly targets eliminating NFS as a remote storage for edits.
  3. Run NameNode without persistent storage at all and delegate all "persisting" functionality to the Backup node. The trick here is to start name-node with -importCheckpoint option and then run the Backup node.
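As a rough aid for the three use cases above, the following Python sketch maps each one to the extra NameNode-role processes it would start. It assumes the `hdfs namenode -checkpoint` and `hdfs namenode -backup` startup forms described in the HDFS user guide for releases that ship these node types, plus the `-importCheckpoint` option named in the quote; whether your release (for example 1.0.2) supports them should be checked against its own documentation. The helper names `namenode_command` and `plan`, and the `launcher` parameter, are hypothetical. The sketch only builds argument lists and does not execute anything.

```python
# Role name -> NameNode startup flag. Assumed from the HDFS user guide;
# verify against the documentation of the release you are running.
ROLE_FLAGS = {
    "checkpoint": "-checkpoint",
    "backup": "-backup",
    "import": "-importCheckpoint",
}


def namenode_command(role, launcher="hdfs"):
    """Build the argv list for starting a NameNode process in the given role."""
    if role not in ROLE_FLAGS:
        raise ValueError(f"unknown role: {role}")
    return [launcher, "namenode", ROLE_FLAGS[role]]


def plan(use_case):
    """Return the commands, in order, for use case 1, 2 or 3 from the list above.

    Use cases 1 and 2 list only the process started in addition to the
    regular NameNode; use case 3 starts the NameNode itself with
    -importCheckpoint and then the Backup node.
    """
    roles = {
        1: ["checkpoint"],
        2: ["backup"],
        3: ["import", "backup"],
    }
    if use_case not in roles:
        raise ValueError(f"unknown use case: {use_case}")
    return [namenode_command(role) for role in roles[use_case]]
```

For example, `plan(3)` returns the `-importCheckpoint` command followed by the `-backup` command, matching the order given in the third use case.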
