What are the pros and cons of loading data directly into Google BigQuery vs going through Cloud Storage first?

Question

Also, is there anything wrong with doing transforms/joins directly within BigQuery? I'd like to minimize the number of components and steps involved in a data warehouse I'm setting up (simple transaction and inventory data for a chain of retail stores).

Recommended answer

Loading data via Cloud Storage is the fastest (and cheapest) way. You can also load directly from an application using streaming inserts, which add some extra cost.
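
To make the two loading paths concrete, here is a minimal sketch using the google-cloud-bigquery Python client. It assumes a CSV file with a header row already sits in a bucket; the helper names load_from_gcs and stream_rows are hypothetical and only for illustration, as are any bucket and table names you pass in.

```python
def load_from_gcs(client, gcs_uri, table_id):
    """Batch-load a CSV file from Cloud Storage into a BigQuery table."""
    # Imported lazily so stream_rows() below can be used without the library.
    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumption: the file has a header row
        autodetect=True,      # let BigQuery infer the schema
    )
    load_job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
    load_job.result()  # wait for the load job; raises if it failed
    return load_job.output_rows


def stream_rows(client, table_id, rows):
    """Stream rows (a list of dicts) directly into a table from an app."""
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        # insert_rows_json reports per-row problems instead of raising
        raise RuntimeError(f"streaming insert failed: {errors}")
    return len(rows)
```

Both helpers expect a bigquery.Client() instance. The batch path suits periodic bulk files such as nightly store exports, while the streaming path suits rows that must be queryable right away.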

As for transformations: if what you plan or need to do can be done in BigQuery, you should do it in BigQuery :) It is the best and fastest way of doing ETL. But you should take into account the cost of running queries (if you are not paying Google for slots, it could be $5 per 1 TB scanned).
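
As a rough illustration of that per-scan cost, the sketch below turns a number of scanned bytes into dollars. The $5 per TB rate is the figure quoted above and is passed in as a parameter because actual pricing may differ. Treating 1 TB as 2**40 bytes is an assumption, and the function name estimate_query_cost is hypothetical.

```python
BYTES_PER_TB = 2 ** 40  # assumption: 1 TB is counted as 2**40 bytes


def estimate_query_cost(bytes_scanned, usd_per_tb=5.0):
    """Rough on-demand cost of a query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * usd_per_tb


# A transform that scans 200 GB once a day, over a 30-day month
daily_bytes = 200 * 2 ** 30
monthly_cost = 30 * estimate_query_cost(daily_bytes)
print(f"about ${monthly_cost:.2f} per month")
```

One way to get the byte count before running a query is a dry run: with the Python client, a query submitted with QueryJobConfig(dry_run=True) reports total_bytes_processed without executing.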

Another good option for complex ETL is Dataflow, but it can become expensive very quickly, in exchange for more flexibility.
