Is it possible to keep a shared state between windows when using UDFs in BigQuery?

Question

This is a follow-up to my previous question about emulating aggregate functions (like those in PGSQL) in BigQuery.

The solution proposed in the previous question does work when the function applied to each window is independent of the previous window, such as a simple average. It does not cover recursive functions such as the exponential moving average, where each value depends on the previous one: EMA[i] = price[i]*k + EMA[i-1]*(1-k)
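To make the dependency concrete, here is a minimal sketch of the recurrence in plain JavaScript. The helper name `emaStep`, the sample prices, and k = 0.5 are assumptions chosen only to keep the arithmetic easy to follow; seeding with the first price is also an assumption, not the SMA seed used in the question.

```javascript
// One EMA step: blend the new price with the previous EMA value.
function emaStep(price, prevEma, k) {
  return price * k + prevEma * (1 - k);
}

// Hypothetical sample data; k = 0.5 is used only for readable numbers.
const prices = [10, 20, 30];
const k = 0.5;

let ema = prices[0];          // seed: the first price stands in for EMA[0]
const series = [ema];
for (let i = 1; i < prices.length; i++) {
  // Each step needs the value produced by the previous iteration.
  ema = emaStep(prices[i], ema, k);
  series.push(ema);
}
console.log(series); // [ 10, 15, 22.5 ]
```

The variable `ema` is exactly the state that has to survive from one row to the next, and a UDF called once per window has nowhere to keep it.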

Using the same example from the previous question,

CREATE OR REPLACE FUNCTION temp_db.ema_func(arr ARRAY<FLOAT64>, window_size FLOAT64)
RETURNS FLOAT64 LANGUAGE js AS """
    // FLOAT64 is used throughout because JavaScript UDFs do not accept INT64 inputs
    if (arr.length <= window_size) {
        // calculate a simple moving average until the end of the first window
        var SMA = 0;
        for (var i = 0; i < arr.length; i++) {
            SMA = SMA + arr[i];
        }
        return SMA / arr.length;
    } else {
        // start calculating the EMA, where EMA[i-1] is the SMA calculated for the first window
        // note: the constant k is hard-coded (0.05) for the sake of simplicity
        // the problem: where do I get EMA[i-1] (prev_ema) from? It is not defined anywhere here.
        // in this example we only need the most recent value, but in the general case
        // we would potentially have to do other calculations with the new value
        return arr[arr.length - 1] * 0.05 + prev_ema * (1 - 0.05);
    }
""";

select s_id, temp_db.ema_func(ARRAY_AGG(CAST(s_price AS FLOAT64)) over (partition by s_id order by s_date rows 40 preceding), 40) as temp_col
from temp_db.s_table;

In PGSQL, storing a state variable as a custom type is very easy, and the state is part of the aggregate function's parameters. Would it be possible to emulate the same functionality in BigQuery?

Recommended answer

I don't think this can be done generically in BigQuery; I would rather look at the specific case and see whether a reasonable workaround is possible. Meanwhile, recursion and aggregate UDFs are not supported in BQ (hopefully just not yet), so you might want to submit the respective feature request(s).

Also check out BQ scripting, though I don't think your case will fit there.
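For the EMA case specifically, one possible workaround (a sketch, not something confirmed in this thread) is to avoid sharing state between calls altogether: pass the whole ordered partition to the UDF once and let it return the full series, so the running value lives in a local variable. The function name `emaSeries` and its parameters are hypothetical; it assumes `windowSize` is at least 1 and that the partition is small enough to fit in the UDF's memory.

```javascript
// Hypothetical UDF body, written as a plain function so it can be run outside BigQuery.
// prices:     one partition's values, already ordered by date
// windowSize: number of leading rows that use a simple moving average as the seed
// k:          smoothing constant (0.05 in the question)
function emaSeries(prices, windowSize, k) {
  const out = [];
  let sum = 0;
  let prevEma = null;   // the state that a per-window call cannot keep
  for (let i = 0; i < prices.length; i++) {
    if (i < windowSize) {
      // Inside the first window: running simple average of all rows so far.
      sum += prices[i];
      prevEma = sum / (i + 1);
    } else {
      // After the first window: the recurrence from the question.
      prevEma = prices[i] * k + prevEma * (1 - k);
    }
    out.push(prevEma);
  }
  return out;
}

console.log(emaSeries([10, 20, 30, 40], 2, 0.5)); // [ 10, 15, 22.5, 31.25 ]
```

In BigQuery this body could presumably sit inside a JavaScript UDF that takes and returns ARRAY<FLOAT64>, called on ARRAY_AGG(s_price ORDER BY s_date) grouped by s_id, with the result unnested WITH OFFSET to get one row per input row again.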
