Correlated subqueries in Snowflake don't work


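Problem description

I'm trying to run the following query in Snowflake, but it fails with the error "Unsupported subquery type cannot be evaluated". The query works in other SQL engines such as PostgreSQL and Presto, so it looks like Snowflake doesn't support this type of query.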

SELECT first_action.date, 
  DATEDIFF('day', first_action.date, returning_action.date) - 1 as diff, 
  APPROXIMATE_SIMILARITY(select MINHASH_COMBINE(value) from (select first_action.user_id_set as value union all select returning_action.user_id_set)) _set
  FROM (select cast(_time as date) as date, minhash(100, _user) as user_id_set from events group by 1) as first_action
  JOIN (select cast(_time as date) as date, minhash(100, _user) as user_id_set from events group by 1) as returning_action 
ON (first_action.date < returning_action.date AND dateadd(day, 14, first_action.date) >= returning_action.date)
group by 1,2

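This is a typical cohort query using MinHash. We compute a MinHash state for each day, join each day with the following 14 days, merge the states, and then compute the final result.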

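Because MINHASH_COMBINE is only available as an aggregate function, with no scalar form that takes two MinHash states directly, we had to use a subquery with UNION ALL to make it work, but that didn't work either. :/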

We're stuck now because we don't know of any workaround. Any help would be appreciated!

Recommended answer

I'm not sure whether this will work, but try separating first_action and returning_action with a WITH clause:

WITH 
first_action as (
    SELECT 
        TRY_CAST(_time AS DATE) as date, 
        MINHASH(100, _user) as user_id_set 
    FROM events 
    GROUP BY 1
),
returning_action as (
    SELECT 
        TRY_CAST(_time AS DATE) as date, 
        MINHASH(100, _user) as user_id_set 
    FROM events 
    GROUP BY 1
)
SELECT 
  fa.date, 
  DATEDIFF('day', fa.date, ra.date) - 1 as diff, 
  APPROXIMATE_SIMILARITY(
      SELECT MINHASH_COMBINE(value) 
      FROM (
          SELECT fa.user_id_set AS value FROM first_action fa
          UNION ALL  
          SELECT ra.user_id_set AS value FROM returning_action ra
      )
  ) _set
FROM first_action fa
JOIN returning_action ra
ON (fa.date < ra.date AND DATEADD(day, 14, fa.date) >= ra.date)
GROUP BY 1,2
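
The query above still passes a subquery to APPROXIMATE_SIMILARITY, so it may hit the same "Unsupported subquery type" error. A possible alternative, shown here only as an untested sketch, is to avoid the subquery altogether. APPROXIMATE_SIMILARITY is itself an aggregate over rows of MinHash states, so each day pair can be expanded into two rows (one state per row) and then grouped. This drops MINHASH_COMBINE and assumes the goal is the similarity between the two days' user sets. The table events and the columns _time and _user come from the question; the names daily, pairs, states, action_date, first_set, returning_set and similarity are made up for illustration.

```sql
-- Untested sketch: replaces the correlated subquery with a join plus UNION ALL.
WITH daily AS (
    -- One MinHash state per calendar day.
    SELECT
        CAST(_time AS DATE) AS action_date,
        MINHASH(100, _user) AS user_id_set
    FROM events
    GROUP BY 1
),
pairs AS (
    -- Pair each day with the days that follow it, up to 14 days later.
    SELECT
        fa.action_date AS first_date,
        ra.action_date AS returning_date,
        fa.user_id_set AS first_set,
        ra.user_id_set AS returning_set
    FROM daily fa
    JOIN daily ra
      ON fa.action_date < ra.action_date
     AND DATEADD(day, 14, fa.action_date) >= ra.action_date
),
states AS (
    -- Two rows per day pair, each carrying one MinHash state.
    SELECT first_date, returning_date, first_set AS state FROM pairs
    UNION ALL
    SELECT first_date, returning_date, returning_set AS state FROM pairs
)
SELECT
    first_date,
    DATEDIFF('day', first_date, returning_date) - 1 AS diff,
    -- Aggregates the two states of each pair into one similarity estimate.
    APPROXIMATE_SIMILARITY(state) AS similarity
FROM states
GROUP BY first_date, returning_date;
```

Grouping by the two dates is equivalent to the original GROUP BY 1,2, because diff is derived from the same pair of dates.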
