Add filename as column on import to BigQuery?

Views: 116 | Posted: 2018/5/7 17:39:46 | Tags: google-bigquery, google-cloud-storage


Problem description

This is a question about importing data files from Google Cloud Storage to BigQuery.

I have a number of JSON files that follow a strict naming convention to include some key data not included in the JSON data itself.

For example:

  xxx_US_20170101.json.gz
  xxx_GB_20170101.json.gz
  xxx_DE_20170101.json.gz

That is, client_country_date.json.gz

At the moment, I have some convoluted processes in a Ruby app that read the files, append the additional data, and then write it back to a file that is then imported into a single daily table for the client in BigQuery.

I am wondering if it is possible to grab and parse the filename as part of the import to BigQuery? I could then drop the convoluted Ruby processes which occasionally fail on larger files.

Solution

You could define an external table pointing to your files:

Note that the table type is "external table", and that it points to multiple files with the * glob.
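
One way to define such a table without the web UI is a DDL statement. The following is only a sketch under stated assumptions: the bucket `my-bucket`, the `xxx_` prefix in the URI, and the table name `mydataset.files_ext` are hypothetical placeholders, and leaving out the column list relies on schema auto-detection.

```sql
#standardSQL
-- Hypothetical names: replace my-bucket, the xxx_ prefix and mydataset.files_ext.
-- No column list is given, so BigQuery is asked to auto-detect the schema.
CREATE EXTERNAL TABLE `mydataset.files_ext`
OPTIONS (
  format = 'NEWLINE_DELIMITED_JSON',
  compression = 'GZIP',
  -- A single * wildcard matches every country/date file for this client.
  uris = ['gs://my-bucket/xxx_*.json.gz']
);
```

The data stays in Cloud Storage; the table only records where the files are and how to read them.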

Now you can query for all data in these files, and query for the meta-column _FILE_NAME:

  #standardSQL
  SELECT *, _FILE_NAME AS filename
  FROM `project.dataset.table`

You can now save these results to a new native table.
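
As a sketch of that last step, the query below parses the client, country, and date out of `_FILE_NAME` and writes the result to a native table in one statement. The destination name `project.dataset.native_table` and the output column names are hypothetical, and the patterns assume the client_country_date.json.gz convention from the question with a two-letter uppercase country code and an eight-digit date.

```sql
#standardSQL
-- Hypothetical destination table; the source is the external table queried above.
CREATE TABLE `project.dataset.native_table` AS
SELECT
  *,
  -- _FILE_NAME holds the full gs:// path, so each pattern is anchored at the end.
  REGEXP_EXTRACT(_FILE_NAME, r'([^/]+)_[A-Z]{2}_\d{8}\.json\.gz$') AS client,
  REGEXP_EXTRACT(_FILE_NAME, r'_([A-Z]{2})_\d{8}\.json\.gz$') AS country,
  PARSE_DATE('%Y%m%d',
             REGEXP_EXTRACT(_FILE_NAME, r'_(\d{8})\.json\.gz$')) AS file_date
FROM `project.dataset.table`;
```

For a file such as xxx_US_20170101.json.gz this should yield client `xxx`, country `US`, and file_date 2017-01-01. A file whose name does not match the convention produces NULL in the extracted columns rather than an error, so it is worth checking for NULLs after the load.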
