Transferring a whole HDFS from one cluster to another

Question

I have lots of Hive tables stored in HDFS on a test cluster with 5 nodes. The data should be around 70 GB * 3 (replication). Now I want to transfer the whole setup to a different environment with many more nodes. A network connection between the two clusters is not possible.

The thing is that I don't have much time with the new cluster, and I have no way to test the transfer on another test environment first. Therefore I need a solid plan. :)

What options do I have?

How can I transfer the Hive setup with a minimum of configuration effort on the new cluster?

Is it possible to just copy the HDFS directories of the 5 nodes to 5 nodes of the new cluster, then add the rest of the nodes to the new cluster and start the balancer?

Solution

Without a network connection, it will be tricky!

I would

  1. Copy the files out of HDFS onto some kind of removable storage (USB stick, external HDD, etc.)
  2. Move the storage to the new cluster
  3. Copy the files back into HDFS
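The steps above can be sketched as a small Python helper that only builds the two `hdfs dfs` command lines (steps 1 and 3); step 2 is the physical move of the drive. This is a minimal sketch, not a tested migration script: the HDFS path `/user/hive/warehouse` and the mount point `/mnt/usb` are assumptions for illustration, and it assumes the parent directory already exists in HDFS on the new cluster.

```python
import posixpath
import shlex


def transfer_plan(hdfs_path, mount_point):
    """Return the export and import commands for one HDFS directory."""
    hdfs_path = hdfs_path.rstrip("/")
    name = posixpath.basename(hdfs_path)    # e.g. "warehouse"
    parent = posixpath.dirname(hdfs_path)   # e.g. "/user/hive"
    staged = posixpath.join(mount_point, name)
    return [
        # Step 1, on the old cluster: copy out of HDFS onto the mounted drive.
        ["hdfs", "dfs", "-get", hdfs_path, mount_point],
        # Step 3, on the new cluster: copy the staged directory back into HDFS.
        # Assumes `parent` already exists there, so the copy keeps its name.
        ["hdfs", "dfs", "-put", staged, parent],
    ]


if __name__ == "__main__":
    # Hypothetical paths; print the commands instead of running them.
    for cmd in transfer_plan("/user/hive/warehouse", "/mnt/usb"):
        print(shlex.join(cmd))
```

Each command list could be passed to `subprocess.run(cmd, check=True)` on the respective cluster once the printed commands look right.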

Note that this won't preserve metadata like file creation/last access time, and, more importantly, ownership and permissions.
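One possible way to carry ownership and permissions across, which is not part of the plan above, is to save the output of `hdfs dfs -ls -R` on the old cluster and turn each line into `-chown` and `-chmod` commands for the new one. The sketch below assumes the usual eight-column listing format (permissions, replication, owner, group, size, date, time, path), ignores sticky bits and ACLs, and assumes the same users and groups exist on the new cluster. Running `-chown` generally requires HDFS superuser rights.

```python
def mode_to_octal(perm_string):
    """Convert e.g. 'drwxr-xr-x' to '755' (sticky bit and ACL marker ignored)."""
    digits = []
    for start in (1, 4, 7):
        triplet = perm_string[start:start + 3]
        value = 0
        for char, bit in zip(triplet, (4, 2, 1)):
            if char in "rwx":
                value += bit
        digits.append(str(value))
    return "".join(digits)


def restore_commands(ls_line):
    """Build chown/chmod commands from one line of `hdfs dfs -ls -R` output."""
    fields = ls_line.strip().split(None, 7)
    if len(fields) < 8:
        return []  # e.g. a "Found N items" header line
    perms, _replication, owner, group, _size, _date, _time, path = fields
    return [
        ["hdfs", "dfs", "-chown", f"{owner}:{group}", path],
        ["hdfs", "dfs", "-chmod", mode_to_octal(perms), path],
    ]
```

File timestamps are not restored by this approach.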

Small-scale testing of this process should be pretty simple.

If you can get network connectivity between the two clusters, even temporarily, then distcp would be the way to go. It uses MapReduce to parallelise the transfers, potentially resulting in massive time savings.
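For the distcp route, a command line could be assembled as in the following sketch. The NameNode addresses `old-nn:8020` and `new-nn:8020`, the path, and the map count are placeholders, not values from the question. `-pugp` asks distcp to preserve user, group and permissions, and `-m` caps the number of parallel map tasks.

```python
def distcp_command(src_namenode, dst_namenode, path, max_maps=20):
    """Build a `hadoop distcp` command copying `path` between two clusters."""
    return [
        "hadoop", "distcp",
        "-pugp",               # preserve user, group, permission
        "-m", str(max_maps),   # upper bound on simultaneous map tasks
        f"hdfs://{src_namenode}{path}",
        f"hdfs://{dst_namenode}{path}",
    ]


if __name__ == "__main__":
    # Hypothetical host names and path, printed rather than executed.
    print(" ".join(distcp_command("old-nn:8020", "new-nn:8020", "/user/hive/warehouse")))
```

The job is typically launched from a node that can reach both NameNodes and all DataNodes.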
