How to load data into Redshift from a custom REST API


Problem description

I am new to AWS, so please forgive me if this question has been asked previously.

I have a REST API that returns two fields (name, email), and I want to load this data into Redshift.

My plan is a Lambda function that runs every 2 minutes and calls the REST API. The API might return at most 3-4 records within each 2-minute window.
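For context, a minimal sketch of such a polling Lambda might look like the following, assuming the API returns a JSON array of objects with name and email fields; the URL and response shape here are hypothetical placeholders:

```python
import json
import urllib.request

API_URL = "https://example.com/api/records"  # hypothetical endpoint

def lambda_handler(event, context):
    # Invoked on a 2-minute EventBridge (CloudWatch Events) schedule;
    # fetches whatever records the API has accumulated since the last run.
    with urllib.request.urlopen(API_URL, timeout=10) as resp:
        records = json.loads(resp.read())
    # Expect at most 3-4 {"name": ..., "email": ...} objects per run.
    return {"count": len(records)}
```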

So, in this situation, is it okay to just do an INSERT operation, or should I still use COPY (via S3)? My only concerns are performance and error-free (robust) data insertion.

Also, the Lambda function will start asynchronously every 2 minutes, so insert operations might overlap (although the data won't overlap).

In this situation, if I go with the S3 option, I am worried that the S3 file generated by a previous Lambda invocation will be overwritten and a conflict will occur.
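If the S3 route is chosen anyway, the overwrite worry can be avoided by giving every invocation its own object key. A sketch, with hypothetical bucket and prefix names:

```python
import csv
import io
import uuid
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

def write_batch_to_s3(records, bucket="my-staging-bucket"):
    # A timestamp plus a random UUID makes the key unique per invocation,
    # so concurrent Lambdas can never overwrite each other's files.
    key = "redshift-staging/{}_{}.csv".format(
        datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S"),
        uuid.uuid4().hex,
    )
    buf = io.StringIO()
    csv.writer(buf).writerows((r["name"], r["email"]) for r in records)
    s3.put_object(Bucket=bucket, Key=key, Body=buf.getvalue().encode("utf-8"))
    return key
```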

Long story short, what is the best practice for inserting a small number of records into Redshift?

PS: I am okay with using other AWS components as well. I even looked into Firehose, which would be perfect for me, but it can't load data into a Redshift cluster in a private subnet.

Thanks in advance.

Recommended answer

Yes, it would be fine to INSERT small amounts of data.
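A minimal sketch of such an INSERT from a Lambda, assuming the psycopg2 driver is bundled with the function and that a hypothetical users(name, email) table exists; the connection parameters are placeholders and would normally come from environment variables or Secrets Manager:

```python
import psycopg2
from psycopg2.extras import execute_values

def insert_records(records):
    conn = psycopg2.connect(
        host="my-cluster.xxxx.us-east-1.redshift.amazonaws.com",  # placeholder
        port=5439,
        dbname="mydb",
        user="myuser",
        password="mypassword",
    )
    try:
        with conn.cursor() as cur:
            # One multi-row INSERT per invocation keeps round trips minimal.
            execute_values(
                cur,
                "INSERT INTO users (name, email) VALUES %s",
                [(r["name"], r["email"]) for r in records],
            )
        conn.commit()
    finally:
        conn.close()
```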

The recommendation to always load via a COPY command applies to large amounts of data, because COPY loads are parallelized across multiple nodes. For just a few rows, however, you can use INSERT without feeling guilty.
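For comparison, a bulk load via COPY issued through the same connection might look roughly like this; the bucket, object key, and IAM role ARN are placeholders:

```python
COPY_SQL = """
COPY users (name, email)
FROM 's3://my-staging-bucket/redshift-staging/batch_0001.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
FORMAT AS CSV;
"""

def copy_batch(conn):
    # COPY pulls the file from S3 and distributes the load work
    # across the cluster's slices in parallel.
    with conn.cursor() as cur:
        cur.execute(COPY_SQL)
    conn.commit()
```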

If your SORTKEY is a timestamp and you are loading data in time order, there is also less need to perform a VACUUM, since the data is already sorted. However, if rows are being deleted, it is still good practice to VACUUM the table regularly.
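A small maintenance sketch, assuming the same hypothetical users table; note that Redshift's VACUUM cannot run inside a transaction block:

```python
def vacuum_table(conn, table="users"):
    # VACUUM must run outside a transaction, so enable autocommit first.
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("VACUUM {};".format(table))
```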
