Amazon AWS S3 directory structure efficiency

Question

I have a simple efficiency question that has been running through my mind.

I have written PHP code that uploads all files in my folders to my bucket on Amazon S3. It can also upload files in subfolders without losing the directory structure.

Basically, a user has to log on to my website and can then upload photos to my bucket on Amazon S3 under their account name. Each user can upload up to 10 photos; each photo is then processed into derived versions, e.g. modified and thumbnail.

How should I structure my directories so that this is efficient on Amazon S3?

Option 1 (files in the same bucket but in different folders, more organized)

username/original/picture01.jpg
username/original/picture02.jpg
username/original/picture03.jpg
....
username/original/picture10.jpg


username/modified/picture01.jpg
username/modified/picture02.jpg
username/modified/picture03.jpg
....
username/modified/picture10.jpg


username/thumbnails/picture01.jpg
username/thumbnails/picture02.jpg
username/thumbnails/picture03.jpg
....
username/thumbnails/picture10.jpg

Option 2 (all files at the top level of the same bucket)

username-original-picture01.jpg
username-original-picture02.jpg
username-original-picture03.jpg
....
username-original-picture10.jpg


username-modified-picture01.jpg
username-modified-picture02.jpg
username-modified-picture03.jpg
....
username-modified-picture10.jpg


username-thumbnails-picture01.jpg
username-thumbnails-picture02.jpg
username-thumbnails-picture03.jpg
....
username-thumbnails-picture10.jpg

Or does it make no difference in Amazon S3?

Recommended answer

It doesn't make a difference for organizational purposes. S3 folders are really just an illusion for the benefit of humans like us, so that things seem familiar; there are no physically separate folders like there are on your own machine.
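To make the "illusion" concrete, here is a minimal sketch in plain Python (no AWS calls) of how a flat list of keys can be presented as folders by grouping on a prefix and a delimiter. The function name `list_folder` and the sample keys are made up for illustration; this only imitates the idea of a delimiter-based listing and is not the S3 API itself.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Group flat keys into 'files' and 'folders' under a prefix.

    Hypothetical helper: it mimics how a delimiter-based listing can
    make a flat key space look like a directory tree.
    """
    files, folders = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter is shown as a "folder".
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            files.append(key)
    return files, sorted(folders)


# The bucket only holds these flat strings; no folder objects exist.
keys = [
    "username/original/picture01.jpg",
    "username/original/picture02.jpg",
    "username/modified/picture01.jpg",
    "username/thumbnails/picture01.jpg",
]

files, folders = list_folder(keys, prefix="username/")
print(folders)
```

Under this view, Option 1 and Option 2 store the same kind of thing: a set of strings. The slash in Option 1 just happens to be the character that tools conventionally group on.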

The naming convention you use, however, will have a tremendous impact on performance once you get to a certain point (for a small number of files, it's probably not going to be noticeable).

In general, you want the beginning of your file/folder names to be 'random-ish', the more random the better, so that S3 can disperse the workload better. If the name prefixes are all the same, there is a potential bottleneck. A short random hash at the beginning of each filename would probably give you the best performance.
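As a rough sketch of what a hash prefix could look like, the snippet below prepends a few hex characters to each key. Note two assumptions that are mine rather than the answer's: the prefix is derived from a SHA-256 of the key (so the same photo always maps to the same key and can be found again without a lookup table) instead of being truly random, and the function name `prefixed_key` and the 4-character length are arbitrary. It is also worth checking the current S3 documentation before adopting this, since AWS has revised its request-rate guidance since this answer was written.

```python
import hashlib


def prefixed_key(username, variant, filename, length=4):
    """Build an object key with a short hash-derived prefix.

    Hypothetical helper: the prefix is the first `length` hex characters
    of the SHA-256 of the logical path, so it is stable across runs.
    """
    logical = f"{username}/{variant}/{filename}"
    digest = hashlib.sha256(logical.encode("utf-8")).hexdigest()
    return f"{digest[:length]}-{logical}"


# Keys for one user's photo in each variant; the leading characters differ
# even though the logical paths share the same "username/" start.
for variant in ("original", "modified", "thumbnails"):
    print(prefixed_key("username", variant, "picture01.jpg"))
```

The trade-off is that keys no longer sort or list neatly by user, so listing everything under `username/` is no longer a single prefix query.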

From the horse's mouth (AWS):

The sequence pattern in the key names introduces a performance problem. To understand the issue, let’s look at how Amazon S3 stores key names.

Amazon S3 maintains an index of object key names in each AWS region. Object keys are stored lexicographically across multiple partitions in the index. That is, Amazon S3 stores key names in alphabetical order. The key name dictates which partition the key is stored in. Using a sequential prefix, such as timestamp or an alphabetical sequence, increases the likelihood that Amazon S3 will target a specific partition for a large number of your keys, overwhelming the I/O capacity of the partition. If you introduce some randomness in your key name prefixes, the key names, and therefore the I/O load, will be distributed across more than one partition.
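The effect described in the quote can be illustrated with a toy model. Everything here is an assumption made for the sake of the example: the real partitioning scheme is internal to S3, so this sketch simply pretends there is one partition per leading character of the key, and the names `partition_of` and `with_hash_prefix` are invented. It compares timestamp-style keys with the same keys after a short hash prefix is added.

```python
import hashlib
from collections import Counter


def partition_of(key):
    # Toy model only: pretend each leading character is its own partition.
    return key[0]


def with_hash_prefix(key, length=4):
    # Prepend the first hex characters of the key's SHA-256 digest.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:length]}-{key}"


# Sixty sequential, timestamp-like keys versus their hash-prefixed versions.
sequential = [f"2013-07-26-15-00-{i:02d}/photo.jpg" for i in range(60)]
hashed = [with_hash_prefix(k) for k in sequential]

seq_load = Counter(partition_of(k) for k in sequential)
hash_load = Counter(partition_of(k) for k in hashed)

print(len(seq_load), "partition(s) hit by sequential keys")
print(len(hash_load), "partition(s) hit by hash-prefixed keys")
```

In this model every sequential key starts with the same character and so lands in a single partition, while the hash-prefixed keys are spread over several.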

If you anticipate that your workload will consistently exceed 100 requests per second, you should avoid sequential key names. If you must use sequential numbers or date and time patterns in key names, add a random prefix to the key name. The randomness of the prefix more evenly distributes key names across multiple index partitions. Examples of introducing randomness are provided later in this topic.

http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
