Is there an equivalent of table wildcard functions in BigQuery with standard SQL?
Question
In legacy SQL, users can use table wildcard functions like TABLE_DATE_RANGE, TABLE_QUERY and TABLE_DATE_RANGE_STRICT.
Is there a similar feature with standard SQL?
Answer
In legacy SQL, users can reference data from a subset of tables in a dataset using table wildcard functions. In standard SQL, users can achieve the same result using UNION ALL. However, this approach is inconvenient when the set of tables has to be determined dynamically, for example by a date range (supported by TABLE_DATE_RANGE and TABLE_DATE_RANGE_STRICT in legacy SQL) or by other complex criteria (supported by TABLE_QUERY in legacy SQL). For these cases, standard SQL offers an equivalent feature, wildcard tables, described below.
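To make the UNION ALL approach concrete, here is a minimal sketch that sums value1 over three explicitly named daily tables. The table and column names are the illustrative ones used throughout this answer, not a real schema.

```sql
-- Explicit UNION ALL over a fixed set of tables (standard SQL).
-- Every table has to be listed by hand, which is why this does not
-- scale to date ranges or dynamically chosen tables.
SELECT SUM(value1) AS total_value1
FROM (
  SELECT value1 FROM `myproject.mydataset.mydailytable_20150105`
  UNION ALL
  SELECT value1 FROM `myproject.mydataset.mydailytable_20150106`
  UNION ALL
  SELECT value1 FROM `myproject.mydataset.mydailytable_20150110`
);
```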
The following legacy SQL query that uses the TABLE_QUERY wildcard function can be rewritten using standard SQL.
Legacy SQL query (using TABLE_QUERY):
SELECT SUM(value1)
FROM TABLE_QUERY([myproject:mydataset], "table_id = 'mydailytable_20150105' OR
table_id = 'mydailytable_20150106' OR table_id = 'mydailytable_20150110'")
GROUP BY value2;
Legacy SQL query (using TABLE_DATE_RANGE, which takes a table name prefix and matches every daily table in the range):
SELECT SUM(value1)
FROM TABLE_DATE_RANGE([myproject:mydataset.mydailytable_], TIMESTAMP("2015-01-05"), TIMESTAMP("2015-01-10"))
Standard SQL query:
SELECT SUM(value1)
FROM `myproject.mydataset.mydailytable_*`
WHERE _TABLE_SUFFIX = '20150105'
OR _TABLE_SUFFIX = '20150106'
OR _TABLE_SUFFIX = '20150110'
GROUP BY value2;
In the above query, the wildcard table `myproject.mydataset.mydailytable_*` matches all tables in the dataset myproject.mydataset whose table_id starts with mydailytable_. To match all tables in the dataset, use an empty prefix for the wildcard: `myproject.mydataset.*` matches every table in the dataset.
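As a sketch of the empty-prefix case: with `myproject.mydataset.*`, the suffix is the entire table name, so the filter has to compare against full table names. The example assumes every matched table exposes a value1 column.

```sql
-- Empty prefix: the wildcard matches every table in the dataset,
-- so _TABLE_SUFFIX holds the full table name.
SELECT SUM(value1) AS total_value1
FROM `myproject.mydataset.*`
WHERE STARTS_WITH(_TABLE_SUFFIX, 'mydailytable_');
```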
Since * is a special character, wildcard table names must be quoted with backticks when they are used in a query.
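The date-range behavior of TABLE_DATE_RANGE can be expressed the same way. Because the suffixes in this example are fixed-width YYYYMMDD strings, a plain string comparison orders them chronologically. A minimal sketch using the same illustrative table names:

```sql
-- Standard SQL counterpart of TABLE_DATE_RANGE:
-- scans mydailytable_20150105 through mydailytable_20150110, inclusive.
SELECT SUM(value1) AS total_value1
FROM `myproject.mydataset.mydailytable_*`
WHERE _TABLE_SUFFIX BETWEEN '20150105' AND '20150110';
```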
The _TABLE_SUFFIX pseudo column:
The _TABLE_SUFFIX pseudo column has type STRING and can be used just like any other column. It is a reserved column name, so it must be aliased when it appears in the SELECT list.
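For instance, a per-table breakdown needs the suffix in the SELECT list, where it has to be given an alias. In the sketch below, table_suffix and total_value1 are arbitrary alias names chosen for illustration.

```sql
-- _TABLE_SUFFIX must be aliased when selected; the alias can then be
-- used for grouping and ordering like any other column.
SELECT
  _TABLE_SUFFIX AS table_suffix,
  SUM(value1) AS total_value1
FROM `myproject.mydataset.mydailytable_*`
WHERE _TABLE_SUFFIX BETWEEN '20150105' AND '20150110'
GROUP BY table_suffix
ORDER BY table_suffix;
```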
Official documentation for this feature is available here:
https://cloud.google.com/bigquery/docs/wildcard-tables
https://cloud.google.com/bigquery/docs/querying-wildcard-tables