平均间隔内的多个平均值 [英] Multiple averages over evenly spaced intervals

查看：99 发布时间：2020/5/29 21:33:50 sql postgresql aggregate-functions average window-functions

本文介绍了平均间隔内的多个平均值的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在尝试学习SQL，所以请耐心等待。我正在使用PostgreSQL 9.3

I'm trying to learn SQL so be patient with me. I'm using PostgreSQL 9.3

我想根据日期窗口对一列取平均值。我已经能够编写通过设置 interval 来完成此操作的窗口函数，但我希望能够随着<$ c $的增长而做到这一点。 c>时间间隔。我的意思是：

I want to average a column based on a window of dates. I've been able to write window functions that accomplish this with a set interval but I'd like to be able to be able to do this with a growing interval. By this I mean:

average values from date_0 to date_1
average values from date_0 to date_2
average values from date_0 to date_3
..... so date date_0 stays the same and date_x grows and creates a larger sample

我假设有一个比对我要平均的每个范围运行查询更好的方法。任何建议表示赞赏。谢谢。

I'm assuming there is a better way than running a query for each range I'd like to average. Any advice is appreciated. Thank you.

我正在尝试创建均匀分布的垃圾箱，以用于汇总

我进入间隔的时间是：

I'm trying to create evenly spaced bins to be used to aggregate values of a table over.
I come to the interval by:

(MAX(date) - MIN(date)) / bins

其中 date 是表的列

和 bins 是我想将表分为的箱数。

where date is the column of a table
and bins is the number of bins I'd like to divide the table into.

date_0 = MIN（date）

date_n = MIN（日期）+（间隔* n）

date_0 = MIN(date)
date_n = MIN(date) + (interval * n)

推荐答案

我建议使用便捷的功能 width_bucket（） ：

I suggest the handy function width_bucket():

获取每个时间段（ bin）的平均值：

To get the average for each time segment ("bin"):

SELECT width_bucket(extract(epoch FROM t.the_date)
                  , x.min_epoch, x.max_epoch, x.bins) AS bin
     , avg(value) AS bin_avg
FROM   tbl t
    , (SELECT extract(epoch FROM min(the_date)) AS min_epoch
            , extract(epoch FROM max(the_date)) AS max_epoch
            , 10 AS bins
       FROM   tbl t
      ) x
GROUP  BY 1;

要获得（逐步）增长时间间隔内的运行平均值：

To get the "running average" over the (step-by-step) growing time interval:

SELECT bin, round(sum(bin_sum) OVER w /sum(bin_ct) OVER w, 2) AS running_avg
FROM  (
   SELECT width_bucket(extract(epoch FROM t.the_date)
                     , x.min_epoch, x.max_epoch, x.bins) AS bin
        , sum(value) AS bin_sum
        , count(*)   AS bin_ct
   FROM   tbl t
       , (SELECT extract(epoch FROM min(the_date)) AS min_epoch
               , extract(epoch FROM max(the_date)) AS max_epoch
               , 10 AS bins
          FROM   tbl t
         ) x
   GROUP  BY 1
   ) sub
WINDOW w AS (ORDER BY bin)
ORDER  BY 1;

使用 the_date 代替 date 作为列名，避免使用保留的单词作为标识符。

由于 width_bucket（）当前仅以双精度和数字，我从 the_date 提取纪元值。此处的详细信息：聚合（x， y）协调PostgreSQL中的点云


Using the_date instead of date as column name, avoiding reserved words as identifiers.

Since width_bucket() is currently only implemented for double precision and numeric, I extract epoch values from the_date. Details here:

Aggregating (x,y) coordinate point clouds in PostgreSQL

                        这篇关于平均间隔内的多个平均值的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

平均间隔内的多个平均值 [英] Multiple averages over evenly spaced intervals

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

平均间隔内的多个平均值 [英] Multiple averages over evenly spaced intervals

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭