Is there a difference in `BigQueryIO` when you use `fromTable` vs `fromQuery("SELECT * ...")` in Dataflow?

Question

When you need to read all the data from one or more BigQuery tables in a Dataflow job, I would say there are two approaches. The first is to use `BigQueryIO` with `from`, which reads the table in question. The second is to use `fromQuery`, where you specify a query that reads all the data from the same table. So my question is:

  • Is there any cost or performance benefit to using one over the other?

I haven't found anything in the docs about this, but I would really like to know. I imagine that `from` may be faster, since you don't need to run a query that scans the data, which makes it more similar to the preview functionality in the BigQuery UI. If that is true, it might also be much cheaper, but it would also make sense if they both cost the same.

So in short, what is the difference between:

BigQueryIO.read(...).from(tableName)

And

BigQueryIO.read(...).fromQuery("SELECT * FROM " + tableName)

Solution

`from` is both cheaper and faster than `fromQuery("SELECT * FROM ...")`.

  • `from` exports the table directly, and exporting data from BigQuery is free.
  • `fromQuery("SELECT * FROM ...")` first runs a query that scans the entire table (billed at $5/TB) and then exports the result.
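To make the cost difference concrete, here is a minimal arithmetic sketch in plain Java. It assumes the $5/TB figure quoted above, treats 1 TB as 2^40 bytes, and ignores any free tier, minimum charges, or flat-rate pricing. The class name, method names, and the 3 TB table size are hypothetical and are not part of the Beam or BigQuery APIs.

```java
public class ScanCostSketch {
    // Assumption: the on-demand query price quoted in the answer, in USD per TB scanned.
    static final double USD_PER_TB = 5.0;
    // Assumption: 1 TB is counted as 2^40 bytes.
    static final double BYTES_PER_TB = 1099511627776.0;

    // Estimated scan charge for fromQuery("SELECT * FROM ..."):
    // the query reads every byte of the table before the result is exported.
    public static double fromQueryCostUsd(long tableBytes) {
        return tableBytes / BYTES_PER_TB * USD_PER_TB;
    }

    // Estimated scan charge for from(tableName):
    // the table is exported directly, so no query scan is billed.
    public static double fromCostUsd(long tableBytes) {
        return 0.0;
    }

    public static void main(String[] args) {
        long tableBytes = 3L << 40; // hypothetical 3 TB table
        System.out.println("from:      " + fromCostUsd(tableBytes) + " USD");
        System.out.println("fromQuery: " + fromQueryCostUsd(tableBytes) + " USD");
    }
}
```

Under these assumptions the scan charge grows linearly with table size for `fromQuery`, while `from` stays at zero. Both paths still incur the usual Dataflow worker costs, which this sketch does not model.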
