How to create a view against a table that has record fields?

Views: 116 Published: 2018/5/7 17:32:07 google-bigquery


Problem description


We have a weekly backup process which exports our production Google Appengine Datastore onto Google Cloud Storage, and then into Google BigQuery. Each week, we create a new dataset named like YYYY_MM_DD that contains a copy of the production tables on that day. Over time, we have collected many datasets, like 2014_05_10, 2014_05_17, etc. I want to create a data set Latest_Production_Data that contains a view for each of the tables in the most recent YYYY_MM_DD dataset. This will make it easier for downstream reports to write their query once and always retrieve the most recent data.

To do this, I have code that gets the most recent dataset and the names of all the tables that dataset contains from the BigQuery API. Then, for each of these tables, I fire a tables.insert call to create a view that is a SELECT * from the table I am looking to create a reference to.

This fails for tables that contain a RECORD field, from what looks to be a pretty benign column-naming rule.

For example, I have this table:

For which I issue this API call:
    {
        'tableReference': {
            'projectId': 'redacted',
            'tableId': u'AccountDeletionRequest',
            'datasetId': 'Latest_Production_Data'
        },
        'view': {
            'query': u'SELECT * FROM [2014_05_17.AccountDeletionRequest]'
        },
    }

This results in the following error:

HttpError: https://www.googleapis.com/bigquery/v2/projects//datasets/Latest_Production_Data/tables?alt=json returned "Invalid field name "__key__.namespace". Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long.">

When I execute this query in the BigQuery web console, the columns are renamed to translate the . to an _. I kind of expected the same thing to happen when I issued the create view API call.
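The naming rule quoted in the error message can be checked locally before issuing the call. The following is only a sketch of that rule as the message words it, not BigQuery's own validation code; `is_valid_field_name` and `to_safe_name` are hypothetical helper names introduced for illustration.

```python
import re

# Rule as worded in the error message: only letters, numbers and underscores,
# starting with a letter or underscore, at most 128 characters long.
FIELD_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,127}")


def is_valid_field_name(name):
    """Return True if `name` satisfies the rule quoted in the error message."""
    return FIELD_NAME_RE.fullmatch(name) is not None


def to_safe_name(name):
    """Mimic the renaming described for the web console: '.' becomes '_'."""
    return name.replace(".", "_")


print(is_valid_field_name("__key__.namespace"))                # dotted name
print(is_valid_field_name(to_safe_name("__key__.namespace")))  # renamed
```

The dotted name is rejected because of the `.`, while the renamed form passes, which matches the behaviour described for the web console.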

Is there an easy way I can programmatically create a view for each of the tables in my dataset, regardless of their underlying schema? The problem I'm encountering now is for record columns, but another problem I anticipate is for tables that have repeated fields. Is there some magic alternative to SELECT * that will take care of all these intricacies for me?

Another idea I had was doing a table copy, but I would prefer not to duplicate the data if I can at all avoid it.

Solution

Here is the workaround code I wrote to dynamically generate a SELECT statement for each of the tables:
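
    # `table_service` and `BQ_PROJECT_ID` are not defined in this snippet;
    # they must be set up elsewhere before calling these functions.
    def get_leaf_column_selectors(dataset, table):
        # Fetch the table schema from the BigQuery API.
        schema = table_service.get(
            projectId=BQ_PROJECT_ID,
            datasetId=dataset,
            tableId=table
        ).execute()['schema']

        return ",\n".join([
            _get_leaf_selectors("", top_field)
            for top_field in schema["fields"]
        ])


    def _get_leaf_selectors(prefix, field):
        # Build the dotted path template for this nesting level.
        if prefix:
            name_format = prefix + ".%s"
        else:
            name_format = "%s"

        if 'fields' not in field:
            # Base case: a leaf column, aliased with "." replaced by "_".
            actual_name = name_format % field["name"]
            safe_name = actual_name.replace(".", "_")
            return "%s as %s" % (actual_name, safe_name)
        else:
            # Recursive case: a RECORD field, so descend into its sub-fields.
            return ",\n".join([
                _get_leaf_selectors(name_format % field["name"], sub_field)
                for sub_field in field["fields"]
            ])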
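
To show how the generated selectors could feed the view definition from the question, here is a minimal sketch that works on a schema dictionary directly instead of calling the API. The names `leaf_selectors` and `build_view_body`, the project ID `my-project`, and the example schema (field names and types) are assumptions made up for illustration; the sketch has not been run against BigQuery and does not address repeated fields.

```python
def leaf_selectors(field, prefix=""):
    """Return a list of 'a.b AS a_b' selectors for every leaf under `field`."""
    name = prefix + "." + field["name"] if prefix else field["name"]
    if "fields" not in field:
        # Leaf column: alias the dotted path with underscores.
        return ["%s AS %s" % (name, name.replace(".", "_"))]
    selectors = []
    for sub_field in field["fields"]:
        # RECORD field: recurse with the current path as prefix.
        selectors.extend(leaf_selectors(sub_field, name))
    return selectors


def build_view_body(project_id, source_dataset, view_dataset, table, schema):
    """Build a request body shaped like the one in the question."""
    selectors = []
    for top_field in schema["fields"]:
        selectors.extend(leaf_selectors(top_field))
    query = "SELECT\n  %s\nFROM [%s.%s]" % (
        ",\n  ".join(selectors), source_dataset, table)
    return {
        "tableReference": {
            "projectId": project_id,
            "datasetId": view_dataset,
            "tableId": table,
        },
        "view": {"query": query},
    }


# Hypothetical schema with one RECORD field and one plain field.
example_schema = {
    "fields": [
        {"name": "__key__", "type": "RECORD", "fields": [
            {"name": "namespace", "type": "STRING"},
            {"name": "id", "type": "INTEGER"},
        ]},
        {"name": "email", "type": "STRING"},
    ]
}

body = build_view_body(
    "my-project", "2014_05_17", "Latest_Production_Data",
    "AccountDeletionRequest", example_schema)
print(body["view"]["query"])
```

The resulting dictionary has the same shape as the request body in the question, with the `SELECT *` replaced by an explicit list of aliased leaf columns.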
