将带有变量的循环转换为 BigQuery SQL [英] Converting loop with variables to BigQuery SQL

查看:18
本文介绍了将带有变量的循环转换为 BigQuery SQL的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有数千个脚本,其中包括数据集循环并使用变量进行累积.例如:

I have thousands of script that include loop over dataset and use variables for accumulation. Eg:

// assuming that 'ids' is populated from some BQ table
ids = [1, 2, 3, 4, 5]
var1 = "v1"   //initialize variable
var2 = "v2"   //initialize variable

for id in ids
    var1 = var2
    if (id > 2)
      var2 = var1 + "-" + item
    else
      var2 = id
    print(id, var1, var2)

这将产生以下输出:

1,v2,1
2,1,2
3,2,2-3
4,2-3,2-3-4
5,2-3-4,2-3-4-5

其他需要考虑的事项:

  • 脚本可以包含 n 个变量.
  • 变量可以作为累加器或可以简单地使用文字.
  • 循环中的条件可以是更复杂的,多个嵌套的 if/else 条件.
  • 脚本也可以有嵌套循环,它会循环 n 次或直到不满足某些条件.

是否可以在 BigQuery SQL 中支持它?我也对 BigQuery UDF 开放,但 UDF 不支持可变状态.

Is it possible to support it in BigQuery SQL? I'm open for BigQuery UDF as well but UDF does not support mutable state.

此外,由于有数千个此类脚本,因此我不想手动执行,而是希望自动转换.解析这些脚本对我来说不是问题,我只需要将其转换为等效的 BigQuery SQL.让我知道如何解决这个问题.

Also since there are thousands of such scripts, instead of doing it manually, I want to automate conversion. Parsing these scripts is not an issue for me, I only need to convert this to equivalent BigQuery SQL. Let me know how I can approach this problem.

虽然我更倾向于仅使用 SQL 的方法,但 BigQuery 脚本 似乎也是一个很有前景的选择.

Though I'm more inclined to use SQL only approach but BigQuery script also seems promising option.

推荐答案

我能够为上述示例脚本转换 SQL 查询:

I was able to convert SQL query for the aforementioned sample script:

with  t as
(select 1 as id union all
select 2 as id union all
select 3 as id union all
select 4 as id union all
select 5 as id)

select id, if(id = 1, 'v2', lag(var2) over(order by id)) var1, var2
from (
    select
      id,
      if(
        id > 2,
        string_agg(if(id > 1, concat('', '', cast(id as string)), null), '-') over (rows between unbounded preceding and current row),
        cast(id as string)
      ) var2
    from t
  )

像这样,我还能够将大部分脚本转换为纯 SQL.只是围绕它构建模型更复杂,但至少我正在推进手动转换.

And like that I'm also able to convert most of the scripts in pure SQL. It's just building model around it is more involved but at least I'm moving ahead with manual conversions.

这篇关于将带有变量的循环转换为 BigQuery SQL的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆