How to organize a large number of objects in cloud storage?
Question
I'm looking for suggestions on how to organize a large number of objects.
Assume the incoming rate is about 60,000,000 files per day, and I would like to keep them for 180 days.
With hourly partitioning, there will be 4,320 (24 * 180) directories at the top level, and each directory will contain ~2,500,000 files on average.
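The figures above can be checked with a few lines of arithmetic. This is only a sketch of the numbers stated in the question; the constant names are made up for illustration, and the total object count is derived from them rather than stated in the question.

```python
# Figures taken from the question; the names are illustrative only.
FILES_PER_DAY = 60_000_000
RETENTION_DAYS = 180
HOURS_PER_DAY = 24

# One top-level "directory" per hour over the retention window.
hourly_partitions = HOURS_PER_DAY * RETENTION_DAYS

# Average number of files landing in each hourly partition.
files_per_partition = FILES_PER_DAY // HOURS_PER_DAY

# Derived: total objects held once the retention window is full.
total_files = FILES_PER_DAY * RETENTION_DAYS

print(hourly_partitions)    # 4320
print(files_per_partition)  # 2500000
print(total_files)          # 10800000000
```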
If I only need to fetch each file individually by its full path, and I do not need to list the contents of a directory, is there any issue with leaving all 2,500,000 of them at the same level?
Or should I hash the filenames and store them in multiple subdirectories, as is typically done when storing files on a traditional file system?
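For reference, the hashed layout being asked about might look roughly like the sketch below. The function name `sharded_path`, the choice of SHA-256, the two-level fan-out, and the example directory name are all assumptions made for illustration; the question does not specify any of them.

```python
import hashlib


def sharded_path(hour_dir: str, filename: str, levels: int = 2) -> str:
    """Build a path with hash-derived subdirectories (hypothetical helper).

    Each level uses two hex characters of the digest, so every level
    fans out into at most 256 subdirectories.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    shards = [digest[2 * i : 2 * i + 2] for i in range(levels)]
    return "/".join([hour_dir, *shards, filename])


# The same filename always maps to the same subdirectories.
print(sharded_path("2013-01-01T05", "report-000123.json"))
```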
Recommended answer
There's no limit on the number of objects you can store in a bucket, and breaking objects into more "subdirectories" doesn't make any scalability or performance difference. To the Google Cloud Storage service, all object names are flat: the "/" in the path looks like any other character in the object name.
Mike Schwartz, Google Cloud Storage Team
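To picture what "flat" means here, the toy model below treats a bucket as a single mapping from full object name to data. It is only an in-memory illustration, not the Cloud Storage API or its implementation; the class `FlatBucket` and the object names are hypothetical.

```python
class FlatBucket:
    """Toy in-memory stand-in for a bucket with a flat namespace."""

    def __init__(self):
        # One mapping keyed by the complete object name.
        self._objects = {}

    def put(self, name: str, data: bytes) -> None:
        self._objects[name] = data

    def get(self, name: str) -> bytes:
        # Fetching by full name is one key lookup, however many
        # "/" characters the name happens to contain.
        return self._objects[name]

    def list(self, prefix: str = "") -> list:
        # A "directory listing" is just a filter on name prefixes.
        return sorted(n for n in self._objects if n.startswith(prefix))


bucket = FlatBucket()
bucket.put("2013-01-01T05/report.json", b"unsharded")
bucket.put("2013-01-01T05/ab/cd/report.json", b"sharded")

# Both names are plain strings; neither creates a real directory.
print(bucket.get("2013-01-01T05/report.json"))
print(bucket.list("2013-01-01T05/ab/"))
```

In this model, adding hash-derived segments only changes the string used as the key, which matches the answer's point that extra "subdirectories" change nothing for fetches by full path.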