Chunk Size Issues in Google Filesystem


Question


From the Google File System paper:

Chunk size is one of the key design parameters. We have chosen 64 MB, which is much larger than typical file system block sizes. Each chunk replica is stored as a plain Linux file on a chunkserver and is extended only as needed. Lazy space allocation avoids wasting space due to internal fragmentation, perhaps the greatest objection against such a large chunk size.

What is lazy space allocation and how is it going to solve the internal fragmentation problem?

A small file consists of a small number of chunks, perhaps just one. The chunkservers storing those chunks may become hot spots if many clients are accessing the same file ... We fixed this problem by storing such executables with a higher replication factor and by making the batch-queue system stagger application start times.

What does staggering application start times mean, and how does it keep chunkservers from becoming hot spots?

Answer

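Lazy space allocation means the filesystem does not actually allocate disk space for the file until data is written to it. Files handled this way are commonly referred to as sparse files. For example, if only the first 2 MB of a 64 MB chunk file has been written, only about 2 MB is actually used on disk. That is why the large chunk size does not cause internal fragmentation: the unused remainder of a partly filled chunk never occupies disk space.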
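To make the 2 MB example concrete, here is a minimal Python sketch. It is not GFS code: `append_to_chunk`, `wasted_if_preallocated`, and the file name `chunk_0001` are hypothetical names chosen for illustration. It assumes a chunk is an ordinary file that grows only when data is appended, and compares its size with what eager allocation of a full 64 MB chunk would have reserved.

```python
import os
import tempfile

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunk size quoted from the paper
MB = 1024 * 1024


def append_to_chunk(path, data):
    """Append data to a chunk file, refusing to grow past CHUNK_SIZE.

    Hypothetical helper: the file is extended only by the bytes written,
    which is the 'extended only as needed' behaviour described above.
    """
    current = os.path.getsize(path) if os.path.exists(path) else 0
    if current + len(data) > CHUNK_SIZE:
        raise ValueError("write would exceed the chunk size")
    with open(path, "ab") as f:
        f.write(data)
    return current + len(data)


def wasted_if_preallocated(path):
    """Bytes that eagerly allocating a full chunk would have left unused."""
    return CHUNK_SIZE - os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp:
    chunk_path = os.path.join(tmp, "chunk_0001")  # hypothetical chunk file name
    size_after_write = append_to_chunk(chunk_path, b"x" * (2 * MB))
    saved = wasted_if_preallocated(chunk_path)
    print(size_after_write // MB, "MB written;", saved // MB, "MB never allocated")
```

The sketch reports the logical file size; the exact on-disk usage depends on the underlying filesystem's block size, so treat the numbers as approximate.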

Staggering application start times just means not starting everything at once. If every application needs to read a few configuration files stored in GFS upon startup and they all start at the same time, the chunkservers holding those files see a burst of load. Spreading out the start times alleviates this.
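The effect of spreading out start times can be illustrated with a small simulation. This is only a sketch under assumed numbers: 100 jobs, each reading from the same chunkserver for 2 seconds, staggered by a fixed 0.5 second interval. The function names and the fixed-interval policy are assumptions for illustration, not how Google's batch-queue system is known to work.

```python
def peak_concurrent_reads(start_times, read_duration):
    """Return the maximum number of reads in flight at any moment."""
    events = []
    for t in start_times:
        events.append((t, 1))                   # a read begins
        events.append((t + read_duration, -1))  # the same read ends
    # At equal timestamps, process ends (-1) before starts (+1) so that
    # back-to-back reads are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def staggered_starts(num_jobs, interval):
    """Start job i at i * interval seconds (assumed fixed-interval policy)."""
    return [i * interval for i in range(num_jobs)]


NUM_JOBS = 100       # assumed number of applications
READ_SECONDS = 2.0   # assumed time each one spends reading at startup

all_at_once = peak_concurrent_reads([0.0] * NUM_JOBS, READ_SECONDS)
spread_out = peak_concurrent_reads(staggered_starts(NUM_JOBS, 0.5), READ_SECONDS)
print(all_at_once, spread_out)  # prints "100 4"
```

With these assumed numbers the chunkserver serves 100 simultaneous readers when everything starts together, but never more than 4 at a time when starts are staggered. The paper's other remedy, a higher replication factor, reduces load differently: it divides the readers across more chunkservers.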
