在Snowflake的公用表表达式中使用&amp；Match_Recognition&amp； [英] Using &quot;match_recognize&quot; in a Common Table Expression in Snowflake

查看：15 发布时间：2022/6/29 22:13:17 sql snowflake-cloud-data-platform common-table-expression

本文介绍了在Snowflake的公用表表达式中使用&amp；Match_Recognition&amp；的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

更新：已回复here。

我正在组合一个有点复杂的查询，以便在Snowflake中对大型时间序列数据集执行事件检测、连接和基于时间的绑定。我最近注意到，match_recognize使我能够雄辩地检测时间序列事件，但每当我尝试在公用表表达式(with .. as ..)中使用match_recognize表达式时，我收到以下错误：

SQL编译错误：此上下文不支持Match_Recognition。

我做了很多搜索/阅读，但没有发现CTE中match_recognize的任何文档限制。以下是我的问题：

with clean_data as (
    -- Remove duplicate entries
    select distinct id, timestamp, measurement
    from dataset
),

label_events as (
    select *
    from clean_data
        match_recognize (
            partition by id
            order by timestamp
            measures
                match_number() as event_number
            all rows per match
            after match skip past last row
            pattern(any_row row_between_gaps+)
            define
                -- Classify contiguous sections of datapoints with < 20min between adjacent points.
                row_between_gaps as datediff(minute, lag(timestamp), timestamp) < 20
        )
)

-- Do binning with width_bucket/etc. here
select id, timestamp, measurement, event_number
from label_events;

我收到了与上面相同的错误。

这是我没有看到的限制，还是我做错了什么？

推荐答案

来自菲利普·霍法的评论：MATCH_RECOGNIZE with CTE in Snowflake

这在当时似乎是Snowflake的一个未记录在案的限制。两个或三个步骤的解决方案对我很有效：

with clean_data as (
    -- Remove duplicate entries
    select distinct id, timestamp, measurement
    from dataset
)

select *
from clean_data
    match_recognize (
        partition by id
        order by timestamp
        measures
            match_number() as event_number
        all rows per match
        after match skip past last row
        pattern(any_row row_between_gaps+)
        define
            -- Classify contiguous sections of datapoints with < 20min between adjacent points.
            row_between_gaps as datediff(minute, lag(timestamp), timestamp) < 20
    );

set quid=last_query_id();

with label_events as (
    select *
    from table(result_scan($quid))
)

-- Do binning with width_bucket/etc. here
select id, timestamp, measurement, event_number
from label_events;

我更喜欢在这里使用变量，因为我可以在开发/调试期间多次重新运行第二个查询，而不必重新运行第一个查询。

还需要注意的是，Snowflake中缓存的地理对象会转换为GEOJSON，因此在使用result_scan检索这些对象时，必须将其类型转换回地理类型。

这篇关于在Snowflake的公用表表达式中使用&amp；Match_Recognition&amp；的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

在Snowflake的公用表表达式中使用&amp；Match_Recognition&amp； [英] Using &quot;match_recognize&quot; in a Common Table Expression in Snowflake

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

在Snowflake的公用表表达式中使用&amp；Match_Recognition&amp； [英] Using &amp;quot;match_recognize&amp;quot; in a Common Table Expression in Snowflake

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

在Snowflake的公用表表达式中使用&amp；Match_Recognition&amp； [英] Using "match_recognize" in a Common Table Expression in Snowflake

登录关闭