Google BigQuery data measurement


Problem description


In Google BigQuery, if I run a query against a view, it tells me how much data was scanned to answer it (BQ bills as a function of this).

However, is there a way to see how much data I have in the BQ environment in general, without querying any of the many views present in the project? My goal is to measure the amount of data in BQ on a daily basis.

Solution

Hopefully the query below gives you an idea of how to quickly check your inventory for specific datasets.
You can extend this logic to union more datasets and apply whatever aggregation you are interested in.

#legacySQL
SELECT table_id,
    DATE(creation_time/1000) AS creation_date,
    DATE(last_modified_time/1000) AS last_modified_date,
    row_count,
    size_bytes,
    CASE
        WHEN type = 1 THEN 'table'
        WHEN type = 2 THEN 'view'
        WHEN type = 3 THEN 'external'
        ELSE '?'
    END AS type,
    TIMESTAMP(creation_time/1000) AS creation_time,
    TIMESTAMP(last_modified_time/1000) AS last_modified_time,
    dataset_id,
    project_id
FROM [project.dataset1.__TABLES__],   
     [project.dataset2.__TABLES__],
     [project.dataset3.__TABLES__],
     [project.dataset4.__TABLES__],
     [project.dataset5.__TABLES__]

Depending on the size of the datasets (in terms of the number of tables in them), at some point the query above can start complaining, so you might need to batch your stats. Hope this helps.
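
As one example of the aggregation mentioned above, here is a minimal sketch that rolls the per-table rows up into one total per dataset, which is closer to a daily storage measurement. It assumes the same placeholder names (`project`, `dataset1`, `dataset2`) as the query above; the output column names (`table_count`, `total_rows`, `total_bytes`, `total_gib`) are made up for illustration.

```sql
#legacySQL
-- Sketch only: project/dataset names are placeholders, output aliases are illustrative.
SELECT
    project_id,
    dataset_id,
    COUNT(table_id) AS table_count,                            -- entries listed in __TABLES__
    SUM(row_count) AS total_rows,
    SUM(size_bytes) AS total_bytes,
    ROUND(SUM(size_bytes) / POW(1024, 3), 2) AS total_gib      -- bytes converted to GiB
FROM [project.dataset1.__TABLES__],   -- in legacy SQL a comma in FROM means UNION ALL
     [project.dataset2.__TABLES__]
GROUP BY project_id, dataset_id
ORDER BY total_bytes DESC
```

This reads only the `__TABLES__` metadata, so none of the views themselves are queried. If the list of datasets grows too long for one query, the same statement can be run once per batch of datasets and the results combined afterwards.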
