BigQuery "resources exceeded" error caused by ORDER BY


Question

When I run the following query, I get a "resources exceeded" error. If I remove the last line (the ORDER BY clause), it works:

SELECT
  id,
  INTEGER(-position / (CASE WHEN fallback = 0 THEN 2 ELSE 1 END)) AS major_sort
FROM (
  SELECT
    id,
    fallback,
    ROW_NUMBER() OVER(PARTITION BY fallback) AS position
  FROM
    [table] AS r
  ORDER BY
    r.score DESC ) AS r
ORDER BY major_sort DESC

Actually, the last line would be:

ORDER BY major_sort DESC, r.score DESC

But that would probably only make things worse.

Any idea how I could change the query to get around this problem?

(If you wonder what this query does: the table contains a "ranking" with multiple fallback strategies, and I want to create an ordering like "AABAABAABAAB", with "A" and "B" being the fallback strategies. If you have a better idea for how to achieve this, please feel free to tell me :D)

Recommended answer

A top-level ORDER BY will always serialize execution of your query: it forces all computation onto a single node for the purpose of sorting. That is the cause of the "resources exceeded" error.

I'm not sure I fully understand your goal with the query, so it's hard to suggest alternatives, but you might consider putting an ORDER BY clause within the OVER(PARTITION BY ...) clause. Sorting within a single partition can be done in parallel across partitions, and may be closer to what you want.
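As a rough sketch of that suggestion, the query from the question could move the score ordering into the window and drop both ORDER BY clauses. This assumes the same legacy SQL dialect and the same columns (id, fallback, score) as the question; score is added to the output only so a consumer can still order by it, and whether this avoids the error depends on the size of each partition.

```sql
-- Sketch only: the ordering by score now lives inside the window,
-- so position is well defined per fallback strategy.
SELECT
  id,
  score,
  INTEGER(-position / (CASE WHEN fallback = 0 THEN 2 ELSE 1 END)) AS major_sort
FROM (
  SELECT
    id,
    fallback,
    score,
    -- Rank rows within each fallback strategy, highest score first.
    ROW_NUMBER() OVER(PARTITION BY fallback ORDER BY score DESC) AS position
  FROM
    [table] )
-- No top-level ORDER BY: major_sort is returned as a field, and the
-- final ordering is left to whatever consumes the result.
```

The result rows come back in no particular order, but each row carries major_sort and score, so the interleaved ordering can be reconstructed downstream.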

More general advice on ordering:

  • Order is not preserved during BQ queries, so if there's an ordering on the input rows that you want to preserve, make sure it's encoded in your data as an extra field.

The use cases for large amounts of globally sorted data are somewhat limited. Often, when users run into resource limitations with ORDER BY, we find that they're actually looking for something slightly different (locally ordered data, or "top N"), and that it's possible to get rid of the global ORDER BY completely.
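If "top N" is what is actually needed, one way to express it is to bound the sorted result with LIMIT. The sketch below assumes that only the first 1000 rows of the interleaved ordering matter; the value 1000 is an arbitrary placeholder, not something from the question, and whether the bounded sort fits within the resource limits still depends on the data.

```sql
-- Sketch only: same query shape as in the question, with the
-- top-level sort bounded to the first N rows (N = 1000 is assumed).
SELECT
  id,
  score,
  INTEGER(-position / (CASE WHEN fallback = 0 THEN 2 ELSE 1 END)) AS major_sort
FROM (
  SELECT
    id,
    fallback,
    score,
    ROW_NUMBER() OVER(PARTITION BY fallback ORDER BY score DESC) AS position
  FROM
    [table] )
ORDER BY
  major_sort DESC,
  score DESC
LIMIT 1000
```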
