PostgreSQL SQL GROUP BY time interval with arbitrary accuracy (down to milliseconds)

Problem description

I have my measurement data stored into the following structure:

CREATE TABLE measurements(
measured_at TIMESTAMPTZ,
val INTEGER
);

I already know that by using

(a) date_trunc('hour',measured_at)

(b) generate_series

I would be able to aggregate my data by:

microseconds,
milliseconds
.
.
.

But is it possible to aggregate the data by 5 minutes, or by an arbitrary number of seconds? In other words, can measured data be aggregated by an arbitrary multiple of seconds?

I need the data aggregated at different time resolutions to feed it into an FFT or an AR model in order to see possible seasonalities.

Recommended answer

You can generate a table of "buckets" by adding intervals created by generate_series(). This SQL statement will generate a table of five-minute buckets for the first day (the value of min(measured_at)) in your data.

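-- Five-minute buckets covering the first day in the data.
-- The series stops at 24*60 - 5 so that the last bucket ends at midnight.
select 
  (select min(measured_at)::date from measurements) + ( n    || ' minutes')::interval start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, (24*60) - 5, 5) n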

Wrap that statement in a common table expression, and you can join and group on it as if it were a base table.

-- Bucket table for the first day; the series stops at 24*60 - 5
-- so that the last bucket ends at midnight.
with five_min_intervals as (
  select 
    (select min(measured_at)::date from measurements) + ( n    || ' minutes')::interval start_time,
    (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
  from generate_series(0, (24*60) - 5, 5) n
)
-- The right join keeps buckets with no measurements (avg_val is null there).
select f.start_time, f.end_time, avg(m.val) avg_val 
from measurements m
right join five_min_intervals f 
        on m.measured_at >= f.start_time and m.measured_at < f.end_time
group by f.start_time, f.end_time
order by f.start_time
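
As a variation on the query above, generate_series() also accepts timestamps with an interval step, which avoids building intervals from strings. The following is only a sketch under the same assumptions (the measurements table from the question); the aliases d, g and day_start are illustrative names, not part of the original answer.

```sql
-- Five-minute buckets for the first day, stepping timestamps directly.
with five_min_intervals as (
  select g.start_time,
         g.start_time + interval '5 minutes' as end_time
  from (select min(measured_at)::date::timestamp as day_start
        from measurements) d,
       -- A function in FROM may refer to earlier FROM items (implicit lateral).
       generate_series(d.day_start,
                       d.day_start + interval '1 day' - interval '5 minutes',
                       interval '5 minutes') as g(start_time)
)
select f.start_time, f.end_time, avg(m.val) as avg_val
from five_min_intervals f
left join measurements m
       on m.measured_at >= f.start_time and m.measured_at < f.end_time
group by f.start_time, f.end_time
order by f.start_time;
```

Changing the three interval literals is enough to switch to another bucket width, including sub-second widths such as interval '250 milliseconds'.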

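Grouping by an arbitrary number of seconds is similar: step generate_series() in seconds instead of minutes, and use date_trunc() to align the start of the first bucket. Note that date_trunc() on its own only truncates to its fixed units (second, minute, hour, and so on), not to arbitrary multiples.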
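
If empty buckets are not needed, one way to sketch grouping by an arbitrary number of seconds without a bucket table is to floor the Unix epoch to a multiple of the bucket width. The 90-second width below is an arbitrary example value, and this swaps in a different technique from the generate_series() approach in this answer. The second query assumes PostgreSQL 14 or later, where date_bin() is available; its origin timestamp is an arbitrary alignment point chosen for illustration.

```sql
-- Group measurements into 90-second buckets by flooring the Unix epoch.
-- The width (90) is illustrative; a fractional width such as 0.25
-- would give 250 ms buckets.
select
  to_timestamp(floor(extract(epoch from measured_at) / 90) * 90) as start_time,
  avg(val) as avg_val
from measurements
group by 1
order by 1;

-- PostgreSQL 14+ only: date_bin(stride, source, origin) expresses the same idea.
select
  date_bin(interval '90 seconds', measured_at,
           timestamptz '2000-01-01 00:00:00+00') as start_time,
  avg(val) as avg_val
from measurements
group by 1
order by 1;
```

Unlike the join against a generated bucket table, both queries return rows only for buckets that contain at least one measurement, which matters if the downstream FFT or AR model expects evenly spaced samples.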

A more general use of generate_series() lets you avoid guessing the upper limit for five-minute buckets. In practice, you'd probably build this as a view or a function. You might get better performance from a base table.

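-- Five-minute buckets from the first day through the last day in the data.
-- Subtracting 5 keeps the last bucket from starting on the day after the last one.
select 
  (select min(measured_at)::date from measurements) + ( n    || ' minutes')::interval start_time,
  (select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, ((select max(measured_at)::date - min(measured_at)::date from measurements) + 1)*24*60 - 5, 5) n;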
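
A function version of this idea might look like the sketch below. The function name measurement_buckets and its parameter bucket_width are hypothetical, introduced here only for illustration; the sketch assumes the measurements table from the question and a bucket width that divides a day evenly.

```sql
-- Hypothetical helper: one row per bucket, covering every day present
-- in measurements. bucket_width is assumed to divide 24 hours evenly.
create or replace function measurement_buckets(bucket_width interval)
returns table (start_time timestamp, end_time timestamp)
language sql stable
as $$
  select g.s, g.s + bucket_width
  from (select min(measured_at)::date::timestamp as first_day,
               max(measured_at)::date::timestamp + interval '1 day' as day_after_last
        from measurements) r,
       generate_series(r.first_day,
                       r.day_after_last - bucket_width,
                       bucket_width) as g(s);
$$;

-- Example use: average per 30-second bucket, keeping empty buckets.
select b.start_time, b.end_time, avg(m.val) as avg_val
from measurement_buckets(interval '30 seconds') b
left join measurements m
       on m.measured_at >= b.start_time and m.measured_at < b.end_time
group by b.start_time, b.end_time
order by b.start_time;
```

Passing the width as an interval lets the same function serve several resolutions, for example interval '5 minutes' or interval '250 milliseconds'.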
