How to pipe data from AWS Postgres RDS to S3 (then Redshift)?


Question

I'm using the AWS Data Pipeline service to pipe data from an RDS MySQL database to S3 and then on to Redshift, which works nicely.

However, I also have data living in an RDS Postgres instance which I would like to pipe the same way, but I'm having a hard time setting up the JDBC connection. If this is unsupported, is there a work-around?

"connectionString": "jdbc:postgresql://THE_RDS_INSTANCE:5432/THE_DB"


Answer

This doesn't work yet. AWS hasn't built or released the functionality to connect nicely to Postgres. You can do it in a ShellCommandActivity, though. You can write a little Ruby or Python code to do it and drop that in a script on S3 using scriptUri (a sketch of that follows the pros and cons below). You could also just write a psql command to dump the table to a CSV and then pipe that to ${OUTPUT1_STAGING_DIR} with "stage": "true" in that activity node.

Something like this:

{
  "id": "DumpCommand",
  "type": "ShellCommandActivity",
  "runsOn": { "ref": "MyEC2Resource" },
  "stage": "true",
  "output": { "ref": "S3ForRedshiftDataNode" },
  "command": "PGPASSWORD=password psql -h HOST -U USER -d DATABASE -p 5432 -t -A -F\",\" -c \"select blah_id from blahs\" > ${OUTPUT1_STAGING_DIR}/my_data.csv"
}

I didn't run this to verify, because it's a pain to spin up a pipeline, so double-check the escaping in the command.


  • Pro: super simple, and it needs no extra script file uploaded to S3.

  • Con: not exactly secure; your database password is transmitted over the wire without encryption.
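If you'd rather go the script-on-S3 route mentioned above, the script might look something like this. A minimal sketch, not verified against a real pipeline: it assumes psycopg2 is installed on the EC2 resource, and the host, credentials, and query are the same placeholders as in the psql command:

import sys

import psycopg2

# Usage: python dump_table.py ${OUTPUT1_STAGING_DIR}/my_data.csv
# The output path is passed in by the ShellCommandActivity command line.
out_path = sys.argv[1]

# Placeholder connection details, matching the psql example above.
conn = psycopg2.connect(host="THE_RDS_INSTANCE", port=5432,
                        dbname="THE_DB", user="USER", password="PASSWORD")
try:
    with conn.cursor() as cur, open(out_path, "w") as f:
        # COPY ... TO STDOUT streams the result set out as CSV instead of
        # buffering the whole table in memory.
        cur.copy_expert("COPY (SELECT blah_id FROM blahs) TO STDOUT WITH CSV", f)
finally:
    conn.close()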

Also look into the new parameterized templating for Data Pipeline that AWS just launched: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-templates.html. It looks like it will allow encryption of arbitrary parameters.
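For example, the database password in the command above could be pulled out into a parameter instead of being hard-coded. A sketch, assuming the #{...} parameter-reference syntax and the documented convention that parameter ids prefixed with an asterisk are masked as secrets (both worth double-checking against the linked guide; the parameter name is hypothetical):

{
  "parameters": [
    {
      "id": "*myDbPassword",
      "type": "String",
      "description": "Password for the Postgres RDS instance"
    }
  ]
}

The activity's command would then reference it as PGPASSWORD=#{*myDbPassword} instead of embedding the plaintext password in the pipeline definition.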
