Import Postgres data into RDS using S3 and aws_s3


Question

I'm having a hard time importing data from S3 into an RDS Postgres instance. According to the docs, you can use this syntax:

aws_s3.table_import_from_s3 (
   table_name text, 
   column_list text, 
   options text, 
   bucket text, 
   file_path text, 
   region text, 
   access_key text, 
   secret_key text, 
   session_token text 
) 

So, in pgAdmin, I did this:

SELECT aws_s3.table_import_from_s3(
  'contacts_1', 
  'firstname,lastname,imported', 
  '(format csv)',
  'com.foo.mybucket', 
  'mydir/subdir/myfile.csv', 
  'us-east-2',
  'AKIAYYXUMxxxxxxxxxxx',
  '3zB4S5jb1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
);

I also tried it with an explicit NULL for the last parameter (session_token).

The error message I get is:

NOTICE:  CURL error code: 51 when attempting to validate pre-signed URL, 1 attempt(s) remaining
NOTICE:  CURL error code: 51 when attempting to validate pre-signed URL, 0 attempt(s) remaining

ERROR:  Unable to generate pre-signed url, look at engine log for details.
SQL state: XX000

I checked the server logs and there was no further information.

I have triple-checked that all the parameters are correct. How do I make this work?

Update:

I can confirm that an s3.getObject() call in the Java AWS SDK succeeds with these same credentials.

Recommended answer

The main issue here is that you need to 1) add an IAM role to the RDS instance so that it can access the S3 bucket, and 2) add an S3 endpoint to the VPC where the RDS instance runs, so that the instance can reach S3.
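As a quick sanity check, a sketch along these lines can show whether both prerequisites are already in place. It assumes the AWS CLI is installed and configured, and that `RDS_INSTANCE_NAME`, `REGION` and `VPC_ID` are set; the function names are hypothetical helpers, not part of any AWS tooling.

```shell
# Sketch only: check the two prerequisites named above.
# Assumes RDS_INSTANCE_NAME, REGION and VPC_ID are set by the caller.

# Print the IAM roles associated with the instance (ARN, feature, status).
list_instance_roles() {
  aws rds describe-db-instances \
    --db-instance-identifier "$RDS_INSTANCE_NAME" \
    --region "$REGION" \
    --query 'DBInstances[0].AssociatedRoles[].[RoleArn,FeatureName,Status]' \
    --output text
}

# Print the IDs of S3 endpoints in the VPC (empty output means none).
list_s3_endpoints() {
  aws ec2 describe-vpc-endpoints \
    --region "$REGION" \
    --filters "Name=vpc-id,Values=$VPC_ID" \
              "Name=service-name,Values=com.amazonaws.$REGION.s3" \
    --query 'VpcEndpoints[].VpcEndpointId' \
    --output text
}

# Succeeds when a role for the s3Import feature is attached
# and at least one S3 endpoint exists in the VPC.
check_s3_import_prereqs() {
  local roles endpoints
  roles=$(list_instance_roles) || return 2
  endpoints=$(list_s3_endpoints) || return 2
  case "$roles" in
    *s3Import*) ;;
    *) echo "no role attached for the s3Import feature" >&2; return 1 ;;
  esac
  if [ -z "$endpoints" ] || [ "$endpoints" = "None" ]; then
    echo "no S3 endpoint found in VPC $VPC_ID" >&2
    return 1
  fi
  echo "prerequisites look OK"
}
```

Calling `check_s3_import_prereqs` before and after the procedure gives a rough indication of which of the two pieces is still missing. It does not verify that the policy attached to the role actually grants access to the bucket.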

This is the procedure I followed to make it work, using AWS CLI commands in a shell (take care to set the environment variables involved to the right values). I hope it helps:

  1. Create the IAM role:

$ aws iam create-role \
    --role-name $ROLE_NAME \
    --assume-role-policy-document '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Principal": {"Service": "rds.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'

  2. Create the IAM policy that will be attached to the IAM role (the bucket name is spliced in outside the single quotes so that the shell expands it):

$ aws iam create-policy \
    --policy-name $POLICY_NAME \
    --policy-document '{"Version": "2012-10-17", "Statement": [{"Sid": "s3import", "Action": ["s3:GetObject", "s3:ListBucket"], "Effect": "Allow", "Resource": ["arn:aws:s3:::'"${BUCKET_NAME}"'", "arn:aws:s3:::'"${BUCKET_NAME}"'/*"]}]}'

  3. Attach the policy to the role:

$ aws iam attach-role-policy \
    --policy-arn arn:aws:iam::$AWS_ACCOUNT_ID:policy/$POLICY_NAME \
    --role-name $ROLE_NAME

  4. Add the role to a specific instance; this step must be repeated for every new instance:

$ aws rds add-role-to-db-instance \
    --db-instance-identifier $RDS_INSTANCE_NAME \
    --feature-name s3Import \
    --role-arn arn:aws:iam::$AWS_ACCOUNT_ID:role/$ROLE_NAME \
    --region $REGION

  5. Create the VPC endpoint for the S3 service:

$ aws ec2 create-vpc-endpoint \
    --vpc-id $VPC_ID \
    --service-name com.amazonaws.$REGION.s3 \
    --route-table-ids $ROUTE_TABLE_ID

The ID of the route table associated with the VPC where the endpoint is created can be retrieved with this command:

$ aws ec2 describe-route-tables | jq -r '.RouteTables[] | "\(.VpcId) \(.RouteTableId)"'
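Since every command above depends on environment variables, an unset one can produce a malformed ARN or service name without an obvious error. A small pre-flight check like the following sketch can catch that early; `require_vars` is a hypothetical helper name, not an AWS CLI feature.

```shell
# Sketch only: fail early if any variable the commands above rely on
# is unset or empty. Pass the variable NAMES, not their values.
require_vars() {
  local missing name value
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible via eval).
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "unset or empty:$missing" >&2
    return 1
  fi
}

# Example call, run before step 1:
#   require_vars ROLE_NAME POLICY_NAME BUCKET_NAME AWS_ACCOUNT_ID \
#                RDS_INSTANCE_NAME REGION VPC_ID ROUTE_TABLE_ID || exit 1
```

If the account ID is not at hand, `aws sts get-caller-identity --query Account --output text` prints it for the credentials currently in use.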
