Granting tens of thousands of AWS accounts access to a bucket?


Question

We are a humble startup that mines data from the entire Internet and puts it in an Amazon S3 bucket to share with the world. For now we have 2TB of data, and soon we may reach the 20TB mark.

Our subscribers will be able to download all the data from the Amazon S3 bucket we have. Apparently we have to opt for Requester Pays for the bandwidth, unless we want to end up with some heartbreaking S3 bills.
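For illustration, this is roughly what a download against a Requester Pays bucket looks like from the subscriber's side (a minimal boto3 sketch; the bucket name matches the policy below, and the object key is a made-up example):

import boto3

# The subscriber authenticates with their own AWS credentials, so the data
# transfer is billed to their account rather than ours.
s3 = boto3.client("s3")

s3.download_file(
    Bucket="ourbucket",
    Key="datasets/crawl-2019.csv.gz",         # hypothetical object key
    Filename="crawl-2019.csv.gz",
    ExtraArgs={"RequestPayer": "requester"},  # required flag on Requester Pays buckets
)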

A pre-signed URL is not an option because it doesn't seem to audit bandwidth usage in real time, and is therefore vulnerable to download abuse.
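For comparison, issuing a pre-signed URL is just a one-liner (sketch below with placeholder names), but once the URL is handed out anyone holding it can keep downloading until it expires, and S3 gives us no real-time view of the bandwidth it consumes:

import boto3

s3 = boto3.client("s3")

# Anyone with this URL can fetch the object until expiry, on our bill.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "ourbucket", "Key": "datasets/crawl-2019.csv.gz"},  # hypothetical key
    ExpiresIn=3600,  # valid for one hour, regardless of how much is downloaded
)
print(url)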

After some research, this seems to be the way to grant different AWS accounts the permissions they need to access our bucket:

{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Sid": "Permissions to foreign account 1",
         "Effect": "Allow",
         "Principal": {
            "AWS": "arn:aws:iam::ForeignAccount-ID-1:root"
         },
         "Action": [
            "s3:GetBucketLocation",
            "s3:ListBucket"
         ],
         "Resource": [
            "arn:aws:s3:::ourbucket"
         ]
      },
      {
         "Sid": "Permissions to foreign account 2",
         "Effect": "Allow",
         "Principal": {
            "AWS": "arn:aws:iam::ForeignAccount-ID-2:root"
         },
         "Action": [
            "s3:GetBucketLocation",
            "s3:ListBucket"
         ],
         "Resource": [
            "arn:aws:s3:::ourbucket"
         ]
      },
      {
         "Sid": "Permissions to foreign account 3",
         "Effect": "Allow",
         "Principal": {
            "AWS": "arn:aws:iam::ForeignAccount-ID-3:root"
         },
         "Action": [
            "s3:GetBucketLocation",
            "s3:ListBucket"
         ],
         "Resource": [
            "arn:aws:s3:::ourbucket"
         ]
      },

      ......

   ]
}

Wherein ForeignAccount-ID-x is an account ID, e.g. 2222-2222-2222.
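To make the maintenance burden concrete, here is a rough sketch (not production code) of how each new subscriber would have to be appended to that policy, one statement per account; the account ID below is a placeholder:

import json
import boto3

s3 = boto3.client("s3")
bucket = "ourbucket"

# Read the current bucket policy, append one statement for the new subscriber,
# and write the whole document back.
policy = json.loads(s3.get_bucket_policy(Bucket=bucket)["Policy"])

new_account = "222222222222"  # hypothetical subscriber account ID
policy["Statement"].append({
    "Sid": f"Permissions to foreign account {new_account}",
    "Effect": "Allow",
    "Principal": {"AWS": f"arn:aws:iam::{new_account}:root"},
    "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
    "Resource": [f"arn:aws:s3:::{bucket}"],
})

# put_bucket_policy rejects the update once the policy exceeds the size limit.
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))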

However, the issue is that we may potentially have tens of thousands or even more subscribers to this bucket.

Is this the right and efficient way to add permissions for them to access this bucket?

Would it pose any performance problems for this bucket, considering every request would have to be evaluated against this mountainous bucket policy?

Is there a better solution?

Answer

Your requirement for Amazon S3 Requester Pays buckets is understandable, but it leads to other limitations.

Users will need their own AWS accounts to authenticate; it will not work with federated logins such as AWS Cognito. Also, pre-signed URLs are of no benefit here because they, too, are generated from an AWS account.

Bucket policies are limited to 20KB, and ACLs are limited to 100 grants.
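A quick back-of-the-envelope check shows how little headroom that leaves (a small sketch; the statement mirrors the one in the question):

import json

# One per-account statement like those in the question, serialized as JSON.
statement = {
    "Sid": "Permissions to foreign account 1",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::222222222222:root"},
    "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::ourbucket"],
}

size = len(json.dumps(statement))  # roughly 200 bytes per statement
print(20 * 1024 // size)           # on the order of 100 accounts fit in 20KB, not tens of thousands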

So, this approach seems unlikely to work.

Another option would be to create a mechanism where your system can push content to another user's AWS account. They would need to provide a destination bucket and some form of access (e.g. an IAM role that can be assumed), and your application could copy files to their bucket. However, this could be difficult for regularly published data.
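A minimal sketch of that push model, assuming each subscriber hands over the ARN of a role you may assume and the name of a destination bucket (all names below are placeholders):

import boto3

# Assume the role the subscriber created for us in their account.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::222222222222:role/DatasetDelivery",  # hypothetical role ARN
    RoleSessionName="dataset-delivery",
)["Credentials"]

# An S3 client that acts with the subscriber's temporary credentials.
subscriber_s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)

# Managed copy into their bucket. Note that a server-side copy like this also
# requires the assumed role to have read access to ourbucket; otherwise the
# application would have to download and re-upload instead.
subscriber_s3.copy(
    {"Bucket": "ourbucket", "Key": "datasets/crawl-2019.csv.gz"},  # hypothetical key
    "subscriber-destination-bucket",
    "crawl-2019.csv.gz",
)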

Another option would be to allow access to the content only from within the same AWS Region. Users would then be able to read and process the data in AWS using services such as Amazon EMR. They could write applications on EC2 that access the data in Amazon S3. They would be able to copy the data to their own buckets. The only thing they could not do is access the data from outside AWS. This would eliminate Data Transfer costs. The data could even be provided in multiple Regions to serve worldwide users.
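From the subscriber's side the same-Region pattern is straightforward (a sketch with placeholder names): code running on an EC2 instance in the bucket's Region reads the objects directly, and same-Region traffic between EC2 and S3 carries no Data Transfer charge.

import boto3

# Runs on an EC2 instance in the same Region as ourbucket (Region assumed here).
s3 = boto3.client("s3", region_name="us-east-1")

# Read an object for processing, or copy it onward into the subscriber's own bucket.
s3.download_file("ourbucket", "datasets/crawl-2019.csv.gz", "/mnt/data/crawl-2019.csv.gz")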

A final option would be to propose your dataset to the AWS Public Dataset Program, which will cover the cost of storage and data transfer for "publicly available high-value cloud-optimized datasets".

