Import Postgres data into RDS using S3 and aws_s3

Question

I'm having a hard time importing data from S3 into an RDS Postgres instance. According to the docs, you can use this syntax:

aws_s3.table_import_from_s3 (
   table_name text, 
   column_list text, 
   options text, 
   bucket text, 
   file_path text, 
   region text, 
   access_key text, 
   secret_key text, 
   session_token text 
) 

So, in pgAdmin, I did this:

SELECT aws_s3.table_import_from_s3(
  'contacts_1', 
  'firstname,lastname,imported', 
  '(format csv)',
  'com.foo.mybucket', 
  'mydir/subdir/myfile.csv', 
  'us-east-2',
  'AKIAYYXUMxxxxxxxxxxx',
  '3zB4S5jb1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
);

I also tried it with an explicit NULL for the last parameter.

The error message I get is:

NOTICE:  CURL error code: 51 when attempting to validate pre-signed URL, 1 attempt(s) remaining
NOTICE:  CURL error code: 51 when attempting to validate pre-signed URL, 0 attempt(s) remaining

ERROR:  Unable to generate pre-signed url, look at engine log for details.
SQL state: XX000

I checked the server logs and found no further information.

I have triple-checked the correctness of all the parameters. How do I make this work?

Update:

I can confirm that an s3.getObject() call in the Java AWS SDK succeeds with these same credentials.

Recommended answer

The main issue here is that you need to 1) add an IAM role to the RDS instance so it can access the S3 bucket, and 2) add an S3 endpoint to the VPC where the RDS instance runs, so that the two can communicate.
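Before changing anything, it can help to see whether either piece is already in place. The following is a minimal sketch, not part of the original answer: the function names `show_instance_roles` and `show_s3_endpoints` are hypothetical, and the sketch assumes `RDS_INSTANCE_NAME`, `REGION`, and `VPC_ID` are set to your own values and that the AWS CLI is configured.

```shell
# Hypothetical helper: list the IAM roles associated with the RDS instance.
# An empty list suggests no role has been attached for the s3Import feature yet.
show_instance_roles() {
    aws rds describe-db-instances \
        --db-instance-identifier "$RDS_INSTANCE_NAME" \
        --region "$REGION" \
        --query 'DBInstances[0].AssociatedRoles' \
        --output json
}

# Hypothetical helper: list the IDs of S3 endpoints that exist in the VPC.
# Empty output suggests the VPC has no S3 endpoint yet.
show_s3_endpoints() {
    aws ec2 describe-vpc-endpoints \
        --region "$REGION" \
        --filters "Name=vpc-id,Values=$VPC_ID" \
                  "Name=service-name,Values=com.amazonaws.$REGION.s3" \
        --query 'VpcEndpoints[].VpcEndpointId' \
        --output text
}
```

Calling `show_instance_roles` and `show_s3_endpoints` in turn shows which of the two requirements, if any, is still missing.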

This is the procedure I followed to make it work, using AWS CLI commands in a shell (take care to set the environment variables involved to the proper values). I hope it helps:

  1. Create the IAM role:

$ aws iam create-role \
    --role-name $ROLE_NAME \
    --assume-role-policy-document '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Principal": {"Service": "rds.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'

  2. Create the IAM policy that will be attached to the IAM role:

$ aws iam create-policy \
    --policy-name $POLICY_NAME \
    --policy-document '{"Version": "2012-10-17", "Statement": [{"Sid": "s3import", "Action": ["s3:GetObject", "s3:ListBucket"], "Effect": "Allow", "Resource": ["arn:aws:s3:::'"$BUCKET_NAME"'", "arn:aws:s3:::'"$BUCKET_NAME"'/*"]}]}'

  3. Attach the policy to the role:

$ aws iam attach-role-policy \
    --policy-arn arn:aws:iam::$AWS_ACCOUNT_ID:policy/$POLICY_NAME \
    --role-name $ROLE_NAME

  4. Add the role to a specific instance (this step must be repeated for every new instance):

$ aws rds add-role-to-db-instance \
    --db-instance-identifier $RDS_INSTANCE_NAME \
    --feature-name s3Import \
    --role-arn arn:aws:iam::$AWS_ACCOUNT_ID:role/$ROLE_NAME \
    --region $REGION

  5. Create the VPC endpoint for the S3 service:

$ aws ec2 create-vpc-endpoint \
    --vpc-id $VPC_ID \
    --service-name com.amazonaws.$REGION.s3 \
    --route-table-ids $ROUTE_TABLE_ID

The ID of the route table for the VPC where the endpoint is created can be retrieved with this command:

$ aws ec2 describe-route-tables | jq -r '.RouteTables[] | "\(.VpcId) \(.RouteTableId)"'
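
Because every command above depends on environment variables, an unset one will silently produce a malformed ARN or service name. As a rough sketch (not part of the original answer), a small guard can confirm they are all set before running the steps. The function name `check_required_vars` is hypothetical; the variable names are the ones the commands above reference.

```shell
# Variables referenced by the commands in the procedure above.
REQUIRED_VARS="ROLE_NAME POLICY_NAME BUCKET_NAME AWS_ACCOUNT_ID RDS_INSTANCE_NAME REGION VPC_ID ROUTE_TABLE_ID"

# Hypothetical helper: report any required variable that is unset or empty.
# Prints "all set" and returns 0 when everything is defined;
# otherwise prints the missing names and returns 1.
check_required_vars() {
    missing=""
    for name in $REQUIRED_VARS; do
        # Indirect lookup of the variable whose name is in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all set"
}
```

Running `check_required_vars || exit 1` at the top of a script stops it before any AWS call is made. If you do not know your account ID, `aws sts get-caller-identity --query Account --output text` prints it.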
