Migration from DynamoDB to Spanner/BigTable
Problem description
I have a use case where I need to migrate 70 TB of data from DynamoDB to BigTable and Spanner. Tables with a single index will go to BigTable; all other tables will go to Spanner.
I can easily handle the historical load by exporting the data along the path S3 --> GCS --> Spanner/BigTable. The challenging part is handling the incremental streaming writes that keep arriving in DynamoDB while the migration runs. There are 300 tables in DynamoDB.
What is the best way to handle this? Has anyone done this before?
Recommended answer
One approach is to use Lambda functions to capture the DynamoDB changes and publish them to GCP Pub/Sub, then have a Dataflow streaming pipeline process the incoming Pub/Sub messages and write each one to Spanner or BigTable, depending on the table.
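The Lambda side of this approach can be sketched as follows. This is a minimal illustration, not a tested production handler. It assumes the Lambda is triggered by DynamoDB Streams, and it uses only the standard library so it can run anywhere. The names `make_handler`, `to_messages`, and `table_name_from_arn` are hypothetical, and `publish` is an injected callable standing in for a real Pub/Sub client call. Attribute values are left in DynamoDB's typed JSON form (for example `{"S": "42"}`) for the downstream pipeline to decode.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical publish hook: (payload bytes, string attributes) -> None.
# In a real deployment this would wrap a Pub/Sub publisher client.
Publish = Callable[[bytes, Dict[str, str]], None]


def table_name_from_arn(stream_arn: str) -> str:
    """Extract the table name from a DynamoDB stream ARN.

    ARN shape: arn:aws:dynamodb:<region>:<account>:table/<name>/stream/<label>
    """
    return stream_arn.split(":table/", 1)[1].split("/", 1)[0]


def to_messages(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a DynamoDB Streams event into one change message per record."""
    messages = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        messages.append({
            "table": table_name_from_arn(record["eventSourceARN"]),
            "op": record["eventName"],             # INSERT, MODIFY or REMOVE
            "keys": change["Keys"],
            "new_image": change.get("NewImage"),   # absent for REMOVE events
            "sequence_number": change["SequenceNumber"],
        })
    return messages


def make_handler(publish: Publish) -> Callable[[Dict[str, Any], Any], Dict[str, int]]:
    """Build a Lambda-style handler(event, context) around a publish hook."""
    def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
        messages = to_messages(event)
        for message in messages:
            payload = json.dumps(message, sort_keys=True).encode("utf-8")
            # The table name travels as a message attribute so the consumer
            # can route without parsing the payload first.
            publish(payload, {"table": message["table"], "op": message["op"]})
        return {"published": len(messages)}
    return handler


# Hand-written sample event in the DynamoDB Streams shape (not real data).
sample_event = {
    "Records": [{
        "eventName": "INSERT",
        "eventSourceARN": "arn:aws:dynamodb:us-east-1:123456789012:"
                          "table/orders/stream/2024-01-01T00:00:00.000",
        "dynamodb": {
            "Keys": {"id": {"S": "42"}},
            "NewImage": {"id": {"S": "42"}, "total": {"N": "10"}},
            "SequenceNumber": "111",
        },
    }]
}

published = []
handler = make_handler(lambda data, attrs: published.append((data, attrs)))
result = handler(sample_event, None)
```

Injecting `publish` keeps the record-shaping logic testable without cloud credentials; the same handler could be reused across all 300 tables because the table name is derived from each record's stream ARN.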
The basic DynamoDB->Spanner solution is documented here: https://cloud.google.com/solutions/migrating-dynamodb-to-cloud-spanner
This could be adapted to route different tables to different destinations.
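One way to picture that adaptation is a per-table routing map consulted before each write. The sketch below is an assumption-laden illustration, not the documented solution. It interprets the question's "single index" rule as "primary key only, no secondary indexes", which may not match the asker's intent. `choose_destination` and `make_dispatcher` are hypothetical names, the input dictionaries mimic the shape of a DynamoDB DescribeTable response, and the writers are plain callables standing in for the real Spanner and BigTable sinks.

```python
import json
from typing import Any, Callable, Dict

Writer = Callable[[Dict[str, Any]], None]


def choose_destination(describe_table_response: Dict[str, Any]) -> str:
    """Pick a target store from a DescribeTable-shaped dictionary.

    Assumption: a table with no secondary indexes counts as "single index".
    """
    table = describe_table_response["Table"]
    secondary_indexes = (
        len(table.get("GlobalSecondaryIndexes", []))
        + len(table.get("LocalSecondaryIndexes", []))
    )
    return "bigtable" if secondary_indexes == 0 else "spanner"


def make_dispatcher(routing: Dict[str, str],
                    writers: Dict[str, Writer]) -> Callable[[bytes], str]:
    """Return a function that sends each change message to its table's sink."""
    def dispatch(payload: bytes) -> str:
        message = json.loads(payload.decode("utf-8"))
        # A KeyError here surfaces a table missing from the routing map.
        destination = routing[message["table"]]
        writers[destination](message)
        return destination
    return dispatch


# Hand-written table descriptions (not real tables).
descriptions = [
    {"Table": {"TableName": "orders",
               "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}]}},
    {"Table": {"TableName": "users",
               "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
               "GlobalSecondaryIndexes": [{"IndexName": "by_email"}]}},
]
routing = {d["Table"]["TableName"]: choose_destination(d) for d in descriptions}

# In-memory lists stand in for the Spanner and BigTable writers.
spanner_rows, bigtable_rows = [], []
dispatch = make_dispatcher(
    routing,
    {"spanner": spanner_rows.append, "bigtable": bigtable_rows.append},
)
dispatch(b'{"table": "users", "op": "INSERT"}')
dispatch(b'{"table": "orders", "op": "MODIFY"}')
```

In a Dataflow pipeline the same decision would sit in a branching step ahead of the two sinks; building the routing map once, up front, keeps the per-message work to a dictionary lookup.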