Group by end of period instead of start date

Question

I'm looking to aggregate data by the end date of a period with some leading window, rather than by its start. For example, I want to query a table and return the count of matching rows in the 30 days PRIOR to (and ending on) each date shown in the results. The original table contains ONLY the timestamp at which each sale was made. Example:

sales_timestamp
------------------
2015-08-05 12:00:00
2015-08-06 13:00:00
2015-08-25 12:31:00
2015-08-26 01:02:00
2015-08-27 02:03:00
2015-08-29 04:23:00
2015-09-01 12:00:00
2015-09-02 12:00:00
2015-09-08 00:00:00

An example of the resulting query output would be:

date_period  |   count_of_sales
--------------------------------
2015-08-24   |        2
2015-08-31   |        6
2015-09-07   |        6

Here, the date_period of 2015-09-07 would imply that the company sold 6 items in the 30 days ENDING on 9/7/2015 (and starting ~8/7/2015 for a true 30-day period).

I've been toying with variations of the date_trunc() function but can't seem to get the truncation to apply to the end date rather than grouping by the start.

This data would be housed on PostgreSQL 9.1.

Recommended answer

This query does all you ask for:

SELECT day::date AS date_period, count_of_sales
FROM (
   SELECT *, sum(ct) OVER (ORDER BY day ROWS 30 PRECEDING) AS count_of_sales
   FROM   generate_series(date '2015-08-24' - 30  -- start 30 days earlier
                        , date '2015-09-07'
                        , interval '1 day') day
   LEFT JOIN (
      SELECT date_trunc('day', sales_timestamp) AS day, count(*)::int AS ct
      FROM   sales
      GROUP  BY 1
      ) s USING (day)
   ) sub
JOIN  generate_series(date '2015-08-24'
                    , date '2015-09-07'
                    , interval '1 week') day USING (day);

SQL Fiddle.


  1. Generate a full set of relevant days (1st generate_series()).
  2. LEFT JOIN to the aggregated counts per day. The LEFT guarantees one row per day, which allows us to use window functions based on the row count.
  3. Use sum() as window aggregate function with a custom frame of 30 rows preceding. The frame ROWS 30 PRECEDING covers the current day plus the 30 days before it, 31 days in total. (You may want to use 29 instead for a window of exactly 30 days; it's unclear how you count.)
  4. Join the result to the actual days you want in the result (2nd generate_series() with one day per week).
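
The steps can be tried without creating a table. The following is a minimal sketch, not part of the original answer: it inlines the question's sample rows in a CTE named sales (which shadows any real table of that name), writes the day series with explicit timestamp literals, and uses ROWS 29 PRECEDING on the assumption that a period of exactly 30 days, end date included, is wanted.

```sql
-- Sketch: sample rows from the question inlined as a CTE, so no table is needed.
WITH sales AS (
   SELECT ts::timestamp AS sales_timestamp
   FROM  (VALUES
            ('2015-08-05 12:00:00'), ('2015-08-06 13:00:00')
          , ('2015-08-25 12:31:00'), ('2015-08-26 01:02:00')
          , ('2015-08-27 02:03:00'), ('2015-08-29 04:23:00')
          , ('2015-09-01 12:00:00'), ('2015-09-02 12:00:00')
          , ('2015-09-08 00:00:00')
         ) v(ts)
   )
SELECT day::date AS date_period, count_of_sales
FROM  (
   -- Steps 1-3: one row per day, then a running sum over the current row
   -- plus the 29 rows before it (assumption: exactly 30 days per period).
   SELECT *, sum(ct) OVER (ORDER BY day ROWS 29 PRECEDING) AS count_of_sales
   FROM   generate_series(timestamp '2015-07-26'  -- 29 days before the first period end
                        , timestamp '2015-09-07'
                        , interval '1 day') day
   LEFT   JOIN (
      SELECT date_trunc('day', sales_timestamp) AS day, count(*)::int AS ct
      FROM   sales
      GROUP  BY 1
      ) s USING (day)
   ) sub
-- Step 4: keep only the weekly period ends.
JOIN   generate_series(timestamp '2015-08-24'
                     , timestamp '2015-09-07'
                     , interval '1 week') day USING (day)
ORDER  BY 1;
```

For the sample data this should reproduce the counts from the question (2, 6 and 6); with these particular rows, a frame of 29 or 30 preceding rows gives the same numbers. The sale on 2015-09-08 falls outside the generated series and is never counted.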

Be aware that the definition of "day" is derived from the current time zone setting of your session if you work with timestamptz. Results can differ between time zones. This does not apply to plain timestamp, which does not depend on the current time zone. Basics:

  • Ignoring timezones altogether in Rails and PostgreSQL
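
If sales_timestamp were stored as timestamptz (an assumption; the question describes a plain timestamp), one way to make day boundaries independent of the session setting is to convert to a fixed zone before truncating. A minimal sketch, with 'UTC' standing in for whichever zone defines a business day:

```sql
-- Assumes sales.sales_timestamp is timestamptz (hypothetical; the question uses timestamp).
-- AT TIME ZONE converts the value to a plain timestamp in the named zone, so
-- date_trunc() no longer depends on the session's TimeZone setting.
-- 'UTC' is a placeholder; substitute the zone that defines a "day" for your data.
SELECT date_trunc('day', sales_timestamp AT TIME ZONE 'UTC') AS day
     , count(*)::int AS ct
FROM   sales
GROUP  BY 1;
```

This is meant as a possible stand-in for the subquery aliased s in the query above, not as a required change.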

Related answer with an explanation of the window function with a custom frame definition:

  • Select finishes where athlete didn't finish first for the past 3 events
