Find out the amount of space each field takes in Google BigQuery


Problem description


I want to optimize the space used by my BigQuery and Google Storage tables. Is there an easy way to find out the cumulative space that each field in a table takes up? This is not straightforward in my case, since I have a complicated hierarchy with many repeated records.

Solution

You can do this in the Web UI by typing (but not running) the query below, replacing <column_name> with the field you are interested in

SELECT <column_name>
FROM YourTable

and looking at the validation message, which reports the number of bytes the query would process.

Important: you do not need to run the query. Just check the validation message for bytesProcessed; this is the size of the respective column.

Validation is free; it invokes a so-called dry run.

If you need to do this kind of "column profiling" for many tables, or for a table with many columns, you can script it in your preferred language: use the Tables.get API to fetch the table schema, loop through all the fields, build the corresponding SELECT statement for each one, and dry-run it. The totalBytesProcessed value returned by each dry run is, as described above, the size of the respective column.
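The loop described above could look roughly like the following Python sketch. It assumes the google-cloud-bigquery client library and default credentials; the helper names build_column_query, column_sizes and main are made up for illustration and are not part of any API. It only covers top-level columns. Fields nested inside repeated records would need their own query form, which is not attempted here.

```python
from typing import Dict


def build_column_query(table_id: str, column: str) -> str:
    """Build the single-column SELECT that will be dry-run (hypothetical helper)."""
    return f"SELECT `{column}` FROM `{table_id}`"


def column_sizes(client, table_id: str, job_config) -> Dict[str, int]:
    """Map each top-level column name to the bytes a dry run reports for it.

    `client` is expected to behave like google.cloud.bigquery.Client and
    `job_config` like a QueryJobConfig created with dry_run=True.
    """
    table = client.get_table(table_id)  # fetches the table schema (Tables.get)
    sizes = {}
    for field in table.schema:  # top-level fields only
        query = build_column_query(table_id, field.name)
        job = client.query(query, job_config=job_config)  # dry run: nothing executes
        sizes[field.name] = job.total_bytes_processed
    return sizes


def main(table_id: str) -> None:
    # Imported here so the helpers above can be used without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default project and credentials
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    sizes = column_sizes(client, table_id, config)
    # Print the largest columns first.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}\t{size}")
```

Calling main("my-project.my_dataset.my_table") with a real table ID (the one shown is a placeholder) prints one line per column. Because every query is a dry run, nothing is executed.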
