Change Block size of existing files in Hadoop


Question

Consider a Hadoop cluster where the default block size in hdfs-site.xml is 64MB. Later on, the team decides to change this to 128MB. Here are my questions for the above scenario:

  1. Will this change require a restart of the cluster, or will it be picked up automatically so that all new files get the default block size of 128MB?
  2. What will happen to the existing files which have a block size of 64MB? Will the change in the configuration apply to existing files automatically? If it is done automatically, when will it happen - as soon as the change is made, or when the cluster is restarted? If it is not done automatically, how can this block-size change be done manually?

Solution

Will this change require a restart of the cluster, or will it be picked up automatically so that all new files get the default block size of 128MB?

A restart of the cluster will be required for this property change to take effect.
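For reference, the block size is controlled by the dfs.blocksize property in hdfs-site.xml (the older, now-deprecated name dfs.block.size also still works). A sketch of what the changed entry might look like - the value is in bytes, and 134217728 bytes = 128MB:

```xml
<!-- hdfs-site.xml: sketch of the changed property -->
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value>
</property>
```

Newer Hadoop releases also accept suffixed values such as 128m for this property.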

What will happen to the existing files which have a block size of 64MB? Will the change in the configuration apply to existing files automatically?

Existing blocks will not change their block size.
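You can confirm this yourself by inspecting an existing file's blocks; a sketch using standard HDFS tooling (the path here is a hypothetical placeholder, and these commands need a running cluster):

```shell
# List the blocks that make up an existing file, with their sizes.
hdfs fsck /user/alice/old-file.txt -files -blocks

# Or print just the block size recorded for the file (%o).
hdfs dfs -stat "blocksize: %o" /user/alice/old-file.txt
```

Files written before the change will keep reporting the old 64MB block size.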

If it is not done automatically, how can this block change be done manually?

To change the existing files you can use distcp. It will copy the files over with the new block size. However, you will have to manually delete the old files with the older block size. Here's a command that you can use:

hadoop distcp -Ddfs.block.size=XX /path/to/old/files /path/to/new/files/with/larger/block/sizes
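Putting the answer together, an end-to-end sketch might look like the following. The paths are assumptions, the block-size value is 128MB in bytes, and you should verify the copy before deleting anything:

```shell
# 1. Re-copy the data with the new 128MB block size
#    (on newer Hadoop versions the property is dfs.blocksize).
hadoop distcp -Ddfs.blocksize=134217728 /data/old /data/new

# 2. Spot-check that the copy succeeded before removing originals.
hdfs dfs -ls /data/new

# 3. Only then delete the files written with the old block size.
hdfs dfs -rm -r /data/old
```

If the files must keep their original paths, a final rename (hdfs dfs -mv) of the new directory over the old location would complete the swap.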
