Best way to migrate large amount of data from US dataset to EU dataset in BigQuery?
Question
I have many TBs in about 1 million tables in a single BigQuery project hosted in multiple datasets that are located in the US. I need to move all of this data to datasets hosted in the EU. What is my best option for doing so?
- I could export the tables to Google Cloud Storage and re-import them with load jobs, but load jobs are capped at 10k per project per day
- I could run it as a query with "allow large results" saved to a destination table, but that doesn't work cross-region
The only option I see right now is to reinsert all of the data using the BQ streaming API, which would be cost prohibitive.
What's the best way to move a large volume of data in many tables cross-region in BigQuery?
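For context, the export/re-import path from the first bullet can be sketched with the `bq` and `gsutil` CLIs. This is a minimal sketch for a single table; the project, bucket, and dataset names are hypothetical, and at ~1 million tables it is exactly the step that runs into the daily load-job quota mentioned above:

```shell
# Export a US table to a US-region GCS bucket (Avro preserves the schema)
bq extract --location=US \
  --destination_format=AVRO \
  my_project:us_dataset.my_table \
  "gs://my-us-bucket/my_table/*.avro"

# Copy the exported files into an EU-region bucket
gsutil -m cp -r gs://my-us-bucket/my_table gs://my-eu-bucket/

# Load into a dataset located in the EU
# (each load counts against the per-project daily load-job quota)
bq load --location=EU \
  --source_format=AVRO \
  my_project:eu_dataset.my_table \
  "gs://my-eu-bucket/my_table/*.avro"
```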
Answer
You have a few options:
- Use load jobs, and contact Google Cloud Support to ask for a quota exception. They're likely to grant 100k or so on a temporary basis (if not, contact me, tigani@google, and I can do so).
- Use federated query jobs. That is, move the data into a GCS bucket in the EU, then re-import the data via BigQuery queries with GCS data sources. More info here.
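The federated-query route in the second bullet might look like the following sketch (names are hypothetical, and the data is assumed to already sit in an EU bucket): a query over an external GCS data source is materialized into a native EU table, consuming query capacity rather than the load-job quota.

```shell
# Define an external table in an EU dataset over Avro files in the EU bucket
bq mk \
  --external_table_definition=AVRO=gs://my-eu-bucket/my_table/*.avro \
  my_project:eu_dataset.my_table_ext

# Materialize it into a native table with a query job
# (this is a query, so it does not count against the load-job quota)
bq query --location=EU --nouse_legacy_sql \
  --destination_table=my_project:eu_dataset.my_table \
  'SELECT * FROM `my_project.eu_dataset.my_table_ext`'
```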
I'll also look into whether we can increase this quota limit across the board.