BigQuery rank() function - Resources exceeded

Question

I have a table with 7 columns and ~8.5 million rows. I'm attempting to select into a destination table with "Allow large results" checked. Note that I've already had to decompose a larger query into multiple steps because of "resources exceeded" errors.

SELECT 
    col1,
    col2,
    col3,
    col4,
    RANK() OVER (PARTITION BY col1 ORDER BY col4 DESC) rank
FROM
[dataset.table]

This returns a "resources exceeded" error.

Recommended answer

Due to the current implementation of window functions, these errors are to be expected when running window functions over large datasets (as in this case, where the result requires the large results flag).

While these limitations are in place, I would suggest running the query in multiple steps, as in:

SELECT col1, col2, col3, col4, RANK() OVER (PARTITION BY col1 ORDER BY col4 DESC) rank FROM [dataset.table]
WHERE ABS(HASH(col1)) % 4 = 0

(Change the 0 to 1, 2, and 3 in turn to cover the whole table, or replace 4 with a larger number if resources are still exceeded.)
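
To avoid editing the remainder by hand for each run, the per-shard queries could be generated with a few lines of Python. This is only a sketch: `build_shard_queries` and `QUERY_TEMPLATE` are hypothetical names, not part of BigQuery, and the script only builds the query strings. It does not submit them.

```python
# Hypothetical helper: builds one legacy-SQL query per shard.
# The SQL text is the query from the answer; only the modulus and
# the remainder are filled in.
QUERY_TEMPLATE = (
    "SELECT col1, col2, col3, col4, "
    "RANK() OVER (PARTITION BY col1 ORDER BY col4 DESC) rank "
    "FROM [dataset.table] "
    "WHERE ABS(HASH(col1)) % {num_shards} = {shard}"
)


def build_shard_queries(num_shards=4):
    """Return one query string per remainder 0..num_shards-1."""
    return [
        QUERY_TEMPLATE.format(num_shards=num_shards, shard=shard)
        for shard in range(num_shards)
    ]


if __name__ == "__main__":
    # Print the four queries; raise num_shards if resources are still exceeded.
    for query in build_shard_queries(4):
        print(query)
```

Because the filter hashes col1, which is also the PARTITION BY key, all rows sharing a col1 value fall into the same shard, so the ranks computed per shard match those the single full query would produce.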
