How to pipe data from AWS Postgres RDS to S3 (then Redshift)?

Question

I'm using the AWS Data Pipeline service to pipe data from an RDS MySQL database to S3 and then on to Redshift, which works nicely.

However, I also have data living in an RDS Postgres instance which I would like to pipe the same way, but I'm having a hard time setting up the JDBC connection. If this is unsupported, is there a work-around?

"connectionString": "jdbc:postgresql://THE_RDS_INSTANCE:5432/THE_DB"

Answer

This doesn't work yet. AWS hasn't built/released the functionality to connect nicely to Postgres. You can do it in a ShellCommandActivity, though. You can write a little Ruby or Python code to do it and drop it in a script on S3 using scriptUri (a Python sketch of that variant is included after the pros and cons below). You could also just write a psql command to dump the table to a CSV and then pipe that to OUTPUT1_STAGING_DIR with "staging: true" in that activity node.

Something like this:

{
  "id": "DumpCommand",
  "type": "ShellCommandActivity",
  "runsOn": { "ref": "MyEC2Resource" },
  "stage": "true",
  "output": { "ref": "S3ForRedshiftDataNode" },
  "command": "PGPASSWORD=password psql -h HOST -U USER -d DATABASE -p 5432 -t -A -F"," -c "select blah_id from blahs" > ${OUTPUT1_STAGING_DIR}/my_data.csv"
}

I didn't run this to verify, because it's a pain to spin up a pipeline :( so double-check the escaping in the command (note the inner quotes around the field separator and the SQL have to be JSON-escaped with backslashes, as above).

  • Pros: super simple, and it requires no additional script file to upload to S3
  • Cons: not exactly secure, since your database password is transmitted over the wire without encryption
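
If you'd rather use the scriptUri route mentioned above, here is a minimal Python sketch of the same dump. This is an illustration, not something from the original answer: it assumes psycopg2 is installed on the EC2 resource, and the host, credentials, and the blahs table are placeholders carried over from the examples above.

#!/usr/bin/env python
# Minimal sketch: dump a Postgres table to CSV in the Data Pipeline staging
# directory, for use from a ShellCommandActivity via scriptUri.
# Assumption: psycopg2 is installed on the EC2 resource; connection details
# and the "blahs" table are placeholders.
import os
import psycopg2

def dump_table_to_csv():
    # Data Pipeline sets OUTPUT1_STAGING_DIR when staging is enabled
    staging_dir = os.environ["OUTPUT1_STAGING_DIR"]
    conn = psycopg2.connect(
        host="THE_RDS_INSTANCE",
        port=5432,
        dbname="THE_DB",
        user="USER",
        password="password",
    )
    try:
        out_path = os.path.join(staging_dir, "my_data.csv")
        with conn.cursor() as cur, open(out_path, "w") as f:
            # COPY streams the query result out of Postgres as CSV
            cur.copy_expert("COPY (SELECT blah_id FROM blahs) TO STDOUT WITH CSV", f)
    finally:
        conn.close()

if __name__ == "__main__":
    dump_table_to_csv()

Same caveat as the psql one-liner: the password is still in plain text here, so this buys you tidier code rather than better security.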

Look into what AWS just launched on parameterized templating for Data Pipeline: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-templates.html. It looks like it will allow encryption of arbitrary parameters.
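
As a rough sketch of how that could look here (the parameters section and the #{myDbPassword} reference are illustrative assumptions based on that doc, not something I've run), the password would move out of the inline command and be supplied at activation time:

{
  "objects": [
    {
      "id": "DumpCommand",
      "type": "ShellCommandActivity",
      "runsOn": { "ref": "MyEC2Resource" },
      "stage": "true",
      "output": { "ref": "S3ForRedshiftDataNode" },
      "command": "PGPASSWORD=#{myDbPassword} psql -h HOST -U USER -d DATABASE -p 5432 -t -A -F\",\" -c \"select blah_id from blahs\" > ${OUTPUT1_STAGING_DIR}/my_data.csv"
    }
  ],
  "parameters": [
    {
      "id": "myDbPassword",
      "type": "String",
      "description": "Postgres password, supplied when the pipeline is activated"
    }
  ]
}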
