AWS DynamoDB Stream into Redshift

Problem description

We would like to move data from DynamoDB (NoSQL) into a Redshift database continuously, as a stream. I am having a hard time understanding all the new terms and technologies in AWS. There are:

1) DynamoDB Streams

2) AWS Lambda

3) AWS Kinesis Firehose

Can someone provide a brief summary of each? What are DynamoDB Streams, and how do they differ from Amazon Kinesis? After reading the available resources, this is my tentative understanding; please verify it.

(a) I assume DynamoDB Streams creates the streaming data from the NoSQL table and starts sending it out. It is the sender.

(b) Lambda lets you pay only for the compute time consumed; it acts as the rented-by-time server that handles the DynamoDB stream.

(c) Kinesis Firehose converts the DynamoDB stream and places it into Redshift.

(d) Amazon QuickSight is AWS's business intelligence tool.

Is that a correct understanding of these terms? I have reviewed a Stack Overflow link, but wanted more thorough information.

Solution

Amazon Kinesis can collect, process, and analyze video and data streams in real time.

  • Use Kinesis Video Streams to capture, process, and store video streams for analytics and machine learning.
  • Use Kinesis Data Streams to build custom applications that analyze data streams using popular stream processing frameworks.
  • Use Kinesis Data Firehose to load data streams into AWS data stores.
  • Use Kinesis Data Analytics to analyze data streams with SQL.

A DynamoDB stream is effectively the same as a Kinesis data stream, but it is generated automatically from new or changed data in DynamoDB. This allows applications to be notified when new data is added to a DynamoDB table, or when existing data is changed.
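To make the "notification" concrete, here is a minimal sketch of what one stream record looks like when it reaches a consumer, and how its typed attribute values can be turned into plain Python values. The table layout (`txn_id`, `amount`, `flagged`, `tags`) is hypothetical, and the `NewImage` field is only present when the stream's view type includes new images. The decoder is hand-written for illustration and covers only the common attribute types.

```python
from decimal import Decimal

# Hypothetical stream record for an INSERT into a table keyed on "txn_id".
SAMPLE_RECORD = {
    "eventName": "INSERT",  # one of INSERT, MODIFY, REMOVE
    "dynamodb": {
        "Keys": {"txn_id": {"S": "t-1001"}},
        "NewImage": {
            "txn_id": {"S": "t-1001"},
            "amount": {"N": "42.50"},
            "flagged": {"BOOL": False},
            "tags": {"L": [{"S": "card"}, {"S": "online"}]},
        },
    },
}


def unmarshal(attr):
    """Convert one DynamoDB attribute value, e.g. {"N": "1"}, to a Python value."""
    # Every attribute value is a single-entry dict of {type_tag: value}.
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        return Decimal(value)  # numbers arrive as strings
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [unmarshal(v) for v in value]
    if type_tag == "M":
        return {k: unmarshal(v) for k, v in value.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def unmarshal_image(image):
    """Convert a whole NewImage or OldImage map to a plain dict."""
    return {name: unmarshal(attr) for name, attr in image.items()}
```

Calling `unmarshal_image(SAMPLE_RECORD["dynamodb"]["NewImage"])` yields an ordinary dictionary that downstream code can work with directly.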

A Kinesis Data Firehose can automatically output a stream into Redshift (amongst other destinations).

AWS Lambda can run code without provisioning or managing servers. You pay only for the compute time you consume — there's no charge when your code isn't running. You can run code for virtually any type of application or backend service — all with zero administration.

Lambda is useful for inspecting data coming through a stream. For example, it could be used to change the data format or to skip over data that is not required.
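As a rough illustration of "change the format or skip", the function below takes one stream record and either returns a flat JSON line or `None` to drop it. The field names (`txn_id`, `amount`) and the output layout are assumptions made for this sketch, not part of any AWS API.

```python
import json


def scalar(attr):
    """Return the raw value of a scalar attribute such as {"S": "x"} or {"N": "1"}."""
    return next(iter(attr.values()))


def transform(record):
    """Return one newline-terminated JSON line, or None to skip the record."""
    if record.get("eventName") not in ("INSERT", "MODIFY"):
        return None  # skip deletions
    image = record.get("dynamodb", {}).get("NewImage")
    if not image or "txn_id" not in image or "amount" not in image:
        return None  # skip items missing the (hypothetical) required fields
    row = {
        "txn_id": scalar(image["txn_id"]),
        "amount": scalar(image["amount"]),  # kept as a string to avoid float rounding
        "event": record["eventName"].lower(),
    }
    return json.dumps(row) + "\n"
```

Emitting one JSON object per line keeps records separable once they are concatenated downstream.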

Putting it all together, you could have data added or modified in DynamoDB. This would cause a record describing the change to be written to the DynamoDB stream. An AWS Lambda function could inspect the data and manipulate or drop the message. It could then forward the data to Kinesis Data Firehose, which automatically inserts the data into Amazon Redshift.
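The flow described above can be sketched as a Lambda handler that forwards stream records to Firehose. This is a sketch under assumptions: the Firehose client is passed in (in a real function it would typically be `boto3.client("firehose")`), the delivery stream name is hypothetical, and the error handling is deliberately simple. The only AWS call used is `put_record_batch`, which accepts at most 500 records per request.

```python
import json

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def default_transform(record):
    """Serialize the new item image as one JSON line; skip records without one."""
    image = record.get("dynamodb", {}).get("NewImage")
    if image is None:
        return None
    return json.dumps({"event": record["eventName"], "item": image}) + "\n"


def make_handler(firehose, stream_name, transform=default_transform):
    """Build a Lambda-style handler that forwards stream records to Firehose."""

    def handler(event, context=None):
        lines = [transform(r) for r in event.get("Records", [])]
        payload = [{"Data": line.encode("utf-8")} for line in lines if line]
        failed = 0
        for start in range(0, len(payload), MAX_BATCH):
            response = firehose.put_record_batch(
                DeliveryStreamName=stream_name,
                Records=payload[start:start + MAX_BATCH],
            )
            failed += response.get("FailedPutCount", 0)
        if failed:
            # Raising signals failure so the batch can be retried; note that a
            # retry may re-send records that were already accepted.
            raise RuntimeError(f"{failed} records were rejected by Firehose")
        return {"forwarded": len(payload)}

    return handler
```

Passing the client in, rather than creating it inside the handler, keeps the function easy to exercise locally with a stand-in object before it is deployed.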

Here's an example:

  • A bank transaction is stored in DynamoDB
  • DynamoDB Streams sends it to a Lambda function
  • The Lambda function looks at the transaction and also retrieves information about the bank account. If there is sufficient balance in the account, the function exits and does nothing.
  • If there is insufficient balance in the account, it could send an email via Amazon SES telling the account holder. It could then send the data to Firehose, which stores it in Redshift for reporting of overdue accounts.
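The bank example above might look roughly like this in code. Everything here is hypothetical: the item fields (`account_id`, `amount`) and the three injected callables, which in a real function would wrap an account lookup, an Amazon SES `send_email` call, and a Firehose `put_record` call respectively.

```python
import json
from decimal import Decimal


def make_overdraft_handler(get_balance, send_email, put_to_firehose):
    """Wire three hypothetical dependencies into a Lambda-style handler."""

    def handler(event, context=None):
        reported = 0
        for record in event.get("Records", []):
            if record.get("eventName") != "INSERT":
                continue  # only newly stored transactions matter here
            image = record["dynamodb"]["NewImage"]
            account_id = image["account_id"]["S"]
            amount = Decimal(image["amount"]["N"])
            balance = get_balance(account_id)
            if balance >= amount:
                continue  # sufficient balance: do nothing
            send_email(
                account_id,
                f"Insufficient balance for a transaction of {amount:.2f}.",
            )
            # Decimals are written as strings because json cannot encode them.
            row = {
                "account_id": account_id,
                "amount": str(amount),
                "balance": str(balance),
            }
            put_to_firehose(json.dumps(row) + "\n")
            reported += 1
        return {"reported": reported}

    return handler
```

Only transactions that fail the balance check produce an email and a Firehose record; all others pass through without side effects, matching the steps in the list.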

The benefit of using these systems together is that they can provide rich application functionality with minimal coding. In this example, only the Lambda function needed coding; the rest worked by linking together various components. It was also totally serverless: there was no need to run an application on an Amazon EC2 instance.
