Max files per directory in S3


Question

If I had a million images, would it be better to store them in some folder/sub-folder hierarchy or just dump them all straight into a bucket (without any folders)?

Would dumping all the images into a hierarchy-less bucket slow down LIST operations?

Is there a significant overhead in creating folders and sub-folders on the fly and setting up their ACLs (programmatically speaking)?

Answer

S3 doesn't respect hierarchical namespaces. Each bucket simply contains a number of mappings from key to object (along with associated metadata, ACLs and so on).

Even though your object's key might contain a '/', S3 treats the path as a plain string and puts all objects in a flat namespace.
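This flat-namespace behaviour can be sketched without touching AWS at all. The keys below and the grouping function are illustrative assumptions; the function mimics what a LIST call with a delimiter does when it reports "common prefixes" as pseudo-folders:

```python
# A bucket is conceptually just a flat map from key string to object data.
bucket = {
    "photos/2023/cat.jpg": b"...",
    "photos/2023/dog.jpg": b"...",
    "photos/readme.txt": b"...",
    "index.html": b"...",
}

def common_prefixes(keys, prefix="", delimiter="/"):
    """Emulate how S3 groups flat keys into pseudo-folders for a LIST call."""
    groups = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter looks like a "folder".
            groups.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    return sorted(groups)

print(common_prefixes(bucket))                   # ['photos/']
print(common_prefixes(bucket, prefix="photos/")) # ['photos/2023/']
```

The "folders" exist only in how clients choose to interpret the key strings; nothing in the bucket itself is nested.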

In my experience, LIST operations do take (linearly) longer as object count increases, but this is probably a symptom of the increased I/O required on the Amazon servers, and down the wire to your client.

However, lookup times do not seem to increase with object count - it's most probably some sort of O(1) hashtable implementation on their end - so having many objects in the same bucket should be just as performant as small buckets for normal usage (i.e. not LISTs).

As for the ACLs, grants can be set on the bucket and on each individual object. As there is no hierarchy, those are your only two options. Obviously, setting as many grants as possible bucket-wide will massively reduce your admin headaches if you have millions of files, but remember you can only grant permissions, not revoke them, so the bucket-wide grants should be the maximal common subset of the ACLs of all the bucket's contents.
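That "maximal common subset" is just the intersection of the per-object grant sets. A minimal sketch, assuming some hypothetical per-object grants (in practice these would come from the objects' actual S3 ACLs):

```python
# Hypothetical grants per object; the names mirror S3 permission strings.
object_grants = {
    "a.jpg": {"READ", "READ_ACP"},
    "b.jpg": {"READ"},
    "c.jpg": {"READ", "WRITE"},
}

# Grants can only be added per object, never revoked, so the bucket-wide
# grant set must not exceed what every object is meant to allow: the
# intersection of all per-object grant sets.
bucket_grants = set.intersection(*object_grants.values())
print(sorted(bucket_grants))  # ['READ']
```

Anything outside the intersection (here `READ_ACP` and `WRITE`) has to be granted on the individual objects instead.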

I'd recommend splitting into separate buckets for:

  • Completely different content - having separate buckets for images, sound and other data makes for a more sane architecture.
  • Significantly different ACLs - if the choice is between one bucket where every object receives a specific ACL, or two buckets with different ACLs and no object-specific ACLs, take the two buckets.
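The first recommendation amounts to a simple routing rule at upload time. A sketch, where the bucket names and the MIME-type-based rule are purely made-up assumptions:

```python
# Illustrative only: bucket names and routing rule are assumptions,
# following the advice to split distinct content types into buckets.
BUCKETS = {
    "image": "example-app-images",
    "audio": "example-app-audio",
}
DEFAULT_BUCKET = "example-app-data"

def bucket_for(content_type: str) -> str:
    """Pick a destination bucket from the MIME type's major part."""
    major = content_type.split("/", 1)[0]
    return BUCKETS.get(major, DEFAULT_BUCKET)

print(bucket_for("image/png"))   # example-app-images
print(bucket_for("audio/mpeg"))  # example-app-audio
print(bucket_for("text/plain"))  # example-app-data
```

Each bucket can then carry the broadest ACL appropriate for its content type, keeping object-specific grants to a minimum.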

