S3 to Redshift: Copy with Access Denied


Question

We used to copy files from S3 to Redshift every day with the COPY command, from a bucket with no specific policy.

COPY schema.table_staging     
FROM 's3://our-bucket/X/YYYY/MM/DD/'     
CREDENTIALS 'aws_access_key_id=xxxxxx;aws_secret_access_key=xxxxxx'     
CSV     
GZIP     
DELIMITER AS '|'     
TIMEFORMAT 'YYYY-MM-DD HH24:MI:SS';  

As we needed to improve the security of our S3 bucket, we added a policy that authorizes connections only from our VPC (the one we use for our Redshift cluster) or from a specific IP address.

{
    "Version": "2012-10-17",
    "Id": "S3PolicyId1",
    "Statement": [
        {
            "Sid": "DenyAllExcept",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::our-bucket/*",
                "arn:aws:s3:::our-bucket"
            ],
            "Condition": {
                "StringNotEqualsIfExists": {
                    "aws:SourceVpc": "vpc-123456789"
                },
                "NotIpAddressIfExists": {
                    "aws:SourceIp": [
                        "12.35.56.78/32"
                    ]
                }
            }
        }
    ]
}
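
For reference, a policy like this can be attached with the boto Python library; a minimal sketch, assuming the JSON above has been saved to a local file named bucket_policy.json (a hypothetical filename):

import boto3

# Attach the deny-all-except policy shown above to the bucket.
with open("bucket_policy.json") as f:
    policy_json = f.read()

s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket="our-bucket", Policy=policy_json)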

This policy works well when accessing files from EC2, EMR, or our specific address using the AWS CLI or the boto Python library.
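
As a sanity check, a listing like the following succeeds when run from inside vpc-123456789 (e.g. on EC2) or from the whitelisted address. A minimal boto3 sketch, reusing the bucket and prefix placeholders from the COPY command above:

import boto3

# Succeeds from inside the VPC or from 12.35.56.78, where the
# policy's Deny statement does not match.
s3 = boto3.client("s3")
response = s3.list_objects_v2(Bucket="our-bucket", Prefix="X/YYYY/MM/DD/")
for obj in response.get("Contents", []):
    print(obj["Key"])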

Here is the error we get on Redshift:

ERROR: S3ServiceException:Access Denied,Status 403,Error AccessDenied,Rid xxxxxx,CanRetry 1
Détail : 
-----------------------------------------------
error:  S3ServiceException:Access Denied,Status 403,Error AccessDenied,Rid xxxxxx,CanRetry 1
code:      8001
context:   Listing bucket=our-bucket prefix=X/YYYY/MM/DD/
query:     1587954
location:  s3_utility.cpp:552
process:   padbmaster [pid=21214]
-----------------------------------------------

Many thanks in advance if you can help us on this,

Damien

PS: this question is quite similar to this one: Copying data from S3 to Redshift - Access denied

Answer

You need to use the 'Enhanced VPC Routing' feature of Redshift. From the documentation here:

When you use Amazon Redshift Enhanced VPC Routing, Amazon Redshift forces all COPY and UNLOAD traffic between your cluster and your data repositories through your Amazon VPC.

  • If Enhanced VPC Routing is not enabled, Amazon Redshift routes traffic through the Internet, including traffic to other services within the AWS network.

  • For traffic to an Amazon S3 bucket in the same region as your cluster, you can create a VPC endpoint to direct traffic directly to the bucket.
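
In practice that means two changes: enable Enhanced VPC Routing on the cluster, then create a gateway VPC endpoint for S3 in vpc-123456789, so that COPY traffic reaches the bucket from inside the VPC and matches the aws:SourceVpc condition in the bucket policy. A minimal boto3 sketch, where the cluster identifier, region, and route table ID are hypothetical placeholders:

import boto3

# Route all COPY/UNLOAD traffic through the VPC.
# Note: changing this setting restarts the cluster.
redshift = boto3.client("redshift", region_name="eu-west-1")  # region is an assumption
redshift.modify_cluster(
    ClusterIdentifier="our-cluster",  # hypothetical cluster name
    EnhancedVpcRouting=True,
)

# Gateway endpoint so S3 requests stay inside the VPC and carry
# aws:SourceVpc, which the policy's Deny exception matches.
ec2 = boto3.client("ec2", region_name="eu-west-1")
ec2.create_vpc_endpoint(
    VpcId="vpc-123456789",
    ServiceName="com.amazonaws.eu-west-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # hypothetical route table
)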
