How to pipe data from AWS Postgres RDS to S3 (then Redshift)?
Question
I'm using the AWS Data Pipeline service to pipe data from an RDS MySQL database to S3 and then on to Redshift, which works nicely.
However, I also have data living in an RDS Postgres instance which I would like to pipe the same way, but I'm having a hard time setting up the JDBC connection. If this is unsupported, is there a work-around?
"connectionString": "jdbc:postgresql://THE_RDS_INSTANCE:5432/THE_DB"
Answer
This doesn't work yet; AWS hasn't built or released the functionality to connect Data Pipeline directly to Postgres. You can do it in a ShellCommandActivity, though. You can write a little Ruby or Python script to do the dump, put that script on S3, and point to it with scriptUri. Alternatively, you can just write a psql command that dumps the table to a CSV, and pipe its output to ${OUTPUT1_STAGING_DIR} with "stage": "true" set on that activity node.
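As a sketch of the "little python code" option: the snippet below streams one table out of Postgres into a CSV using the driver's COPY support. The table and column names, the psycopg2 driver, and the environment-variable connection details are all illustrative assumptions, not part of the original answer.

```python
import os


def build_copy_sql(table, columns):
    """Build a COPY ... TO STDOUT statement for a CSV dump.

    table/columns are illustrative names; whitelist them in real use,
    since they are interpolated directly into the SQL text.
    """
    cols = ", ".join(columns)
    return f"COPY (SELECT {cols} FROM {table}) TO STDOUT WITH CSV"


def dump_table_to_csv(conn, table, columns, out_path):
    """Stream one Postgres table into a CSV file via COPY."""
    sql = build_copy_sql(table, columns)
    with conn.cursor() as cur, open(out_path, "w") as f:
        cur.copy_expert(sql, f)


if __name__ == "__main__":
    # psycopg2 is a third-party driver; the connection details below
    # are placeholders for your RDS instance, read from the environment.
    import psycopg2

    conn = psycopg2.connect(
        host=os.environ["PGHOST"],
        dbname=os.environ["PGDATABASE"],
        user=os.environ["PGUSER"],
        password=os.environ["PGPASSWORD"],
    )
    out_dir = os.environ.get("OUTPUT1_STAGING_DIR", ".")
    dump_table_to_csv(conn, "blahs", ["blah_id"],
                      os.path.join(out_dir, "my_data.csv"))
    conn.close()
```

Using COPY rather than fetching rows and writing them yourself keeps the dump streaming on the server side, which matters for large tables.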
Something like this:
{
"id": "DumpCommand",
"type": "ShellCommandActivity",
"runsOn": { "ref": "MyEC2Resource" },
"stage": "true",
"output": { "ref": "S3ForRedshiftDataNode" },
"command": "PGPASSWORD=password psql -h HOST -U USER -d DATABASE -p 5432 -t -A -F"," -c "select blah_id from blahs" > ${OUTPUT1_STAGING_DIR}/my_data.csv"
}
I didn't run this to verify, because it's a pain to spin up a pipeline :( so double-check the escaping in the command.
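If you'd rather not hand-escape that one-liner, you can assemble it programmatically. This is a minimal sketch using Python's standard `shlex.quote`; the host, user, database, and query values are the placeholder names from the example above, not real settings.

```python
import shlex


def build_dump_command(host, user, database, query, out_file):
    """Assemble the psql one-liner with each argument shell-quoted,
    so quotes and spaces inside the query can't break the command."""
    parts = [
        "psql",
        "-h", host,
        "-U", user,
        "-d", database,
        "-p", "5432",
        "-t", "-A", "-F", ",",
        "-c", query,
    ]
    return " ".join(shlex.quote(p) for p in parts) + " > " + shlex.quote(out_file)


cmd = build_dump_command(
    "THE_RDS_INSTANCE", "USER", "THE_DB",
    "select blah_id from blahs", "/tmp/my_data.csv")
print(cmd)
```

The query ends up single-quoted as one argument, which is exactly the part that is easiest to get wrong when escaping inside a JSON string by hand.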
- Pros: super simple, and requires no extra script file to upload to S3.
- Cons: not exactly secure. Your DB password will be transmitted over the wire without encryption.
Also look into the parameterized-template Data Pipeline feature AWS just launched: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-templates.html. It looks like it will allow encryption of arbitrary parameters.