What is the max limit of group_concat/string_agg in bigquery output?

Problem description

I am using group_concat/string_agg (possibly varchar) and want to ensure that bigquery won't drop any of the data concatenated.

Solution

BigQuery will not drop data if a particular query runs out of memory; you will get an error instead. Try to keep your row sizes below roughly 100 MB, since beyond that you'll start getting errors. You can experiment with building a large string using a query like this:

#standardSQL
SELECT STRING_AGG(word) AS words FROM `bigquery-public-data.samples.shakespeare`;

There are 164,656 rows in this table, and this query creates a string with 1,168,286 characters (around a megabyte in size). However, you'll start to see an error if a query requires more than something on the order of hundreds of megabytes on a single node of execution:

#standardSQL
SELECT STRING_AGG(CONCAT(word, corpus)) AS words
FROM `bigquery-public-data.samples.shakespeare`
CROSS JOIN UNNEST(GENERATE_ARRAY(1, 1000));

This results in an error:

Resources exceeded during query execution.

If you click on the "Explanation" tab in the UI, you can see that the failure happened during stage 1 while building the results of STRING_AGG. In this case, the string would have been 3,303,599,000 characters long, or approximately 3.3 GB in size.
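To check ahead of time whether a concatenation is likely to be too large, one option is to sum the byte lengths of the inputs instead of building the string. The query below is a minimal sketch rather than an official recipe. It assumes the default "," delimiter of STRING_AGG and that NULL values are skipped; the column aliases are made up for illustration.

```sql
#standardSQL
-- Estimate the size of STRING_AGG(word) without materializing the string.
SELECT
  -- STRING_AGG ignores NULLs, so count only non-NULL values.
  COUNT(word) AS non_null_values,
  -- UTF-8 bytes of the values, plus one byte per "," between them.
  SUM(BYTE_LENGTH(word)) + GREATEST(COUNT(word) - 1, 0) AS estimated_bytes
FROM `bigquery-public-data.samples.shakespeare`;
```

If estimated_bytes comes out well above the roughly 100 MB guideline mentioned earlier, the corresponding STRING_AGG query is likely to fail with the "Resources exceeded" error rather than return a truncated string. The same estimate can be applied to the second query by wrapping CONCAT(word, corpus) in BYTE_LENGTH and keeping the CROSS JOIN.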
