BigQuery-将查询结果导出到本地文件/Google存储空间 [英] BigQuery - Export query results to local file/Google storage

查看:406
本文介绍了BigQuery-将查询结果导出到本地文件/Google存储空间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想将查询结果从BigQuery导出到本地文件/Google存储.

I want to export query results from BigQuery to local file/Google storage.

我尝试了'bq extract'命令,但不允许将查询作为输入.

I tried 'bq extract' command but it doesn't allow query as input.

Usage: bq extract <source_table> <destination_uris>

我不想提取整个表,因为该表包含许多不需要的列,我需要汇总数据.

I don't want to extract the whole table as the table contains many columns which are not required and I need to aggregate the data.

到目前为止,我唯一能找到的解决方法是使用"bq查询"命令创建表并使用"bq提取"提取数据.

As of now, the only workaround I could find is to create a table using the 'bq query' command and use the 'bq extract' to extract the data.

我正在寻找通过以下类似方法实现此目标的更好方法.

I'm looking for any better way to achieve this by doing something like below.

bq extract 'select dept_id,sum(sal) from temp.employee 
                                 group by dept_id' 'gs://XXXX/employee.csv'

预先感谢

推荐答案

Mikhail Berlyant 所说,

BigQuery不提供直接导出/下载查询的功能 结果保存到GCS或本地文件中.

BigQuery does not provide ability to directly export/download query result to GCS or Local File.

您仍然可以通过三个步骤使用Web UI导出它

You can still export it using the Web UI in just three steps

  1. 配置查询以将结果保存到BigQuery表中并运行它.
  2. 将表格导出到GCS中的存储桶.
  3. 从存储桶中下载.

步骤1

在BigQuery屏幕中,在运行查询之前,请转到更多>查询设置

Step 1

When in BigQuery screen, before running the query go to More > Query Settings

这将打开以下内容

您想在这里

  • 目标:设置查询结果的目标表
  • 项目名称:选择项目.
  • 数据集名称:选择一个数据集.如果您没有,请创建它,然后再回来.
  • 表名:提供所需的任何名称(必须仅包含字母,数字或下划线).
  • 结果大小:允许较大的结果(没有大小限制).

然后将其保存,并将查询配置为保存在特定表中.现在,您可以运行查询.

Then Save it and the Query is configured to be saved in a specific table. Now you can run the Query.

要将其导出到GCP,您必须转到表并单击EXPORT>导出到GCS.

To export it to GCP you have to go to the table and click EXPORT > Export to GCS.

这将打开以下屏幕

选择GCS位置中,定义存储区,文件夹和文件.

In Select GCS location you define the bucket, the folder and the file.

例如,您有一个名为 daria_bucket 的存储桶(仅使用小写字母,数字,连字符(-)和下划线(_).点(.)可以用于形成一个有效的域名.),并想将文件保存在名称为 test 的存储桶根目录中,然后编写(在选择GCS位置"中)

For instances, you have a bucket named daria_bucket (Use only lowercase letters, numbers, hyphens (-), and underscores (_). Dots (.) may be used to form a valid domain name.) and want to save the file(s) in the root of the bucket with the name test, then you write (in Select GCS location)

daria_bucket/test.csv

如果文件太大(超过1 GB),则会出现错误.要修复它,您必须使用通配符将其保存在更多文件中.因此,您需要添加*,就像这样

If the file is too big (more than 1 GB), you'll get an error. To fix it, you'll have to save it in more files using wildcard. So, you'll need to add *, just like that

daria_bucket/test*.csv

这将在表daria_bucket的内部存储从表中提取的所有数据到一个以上的文件中,文件名为test000000000000,test000000000001,test000000000002,... testX.

This is going to store, inside of the bucket daria_bucket, all the data extracted from the table in more than one file named test000000000000, test000000000001, test000000000002, ... testX.

然后转到存储,您将看到存储桶.

Then go to Storage and you'll see the bucket.

深入其中,您会找到一个(或多个)文件.然后您可以从那里下载.

Go inside of it and you'll find the one (or more) file(s). You can then download from there.

这篇关于BigQuery-将查询结果导出到本地文件/Google存储空间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆