Postgres - how to return rows with 0 count for missing data?

Problem description

I have data that is unevenly distributed (with respect to date) over a few years (2003-2008). I want to query data for a given start and end date, grouping it by any of the intervals supported by date_trunc (day, week, month, quarter, year) in PostgreSQL 8.3 (http://www.postgresql.org/docs/8.3/static/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC).

The problem is that some queries return results that are continuous over the required period, like this one:

select to_char(date_trunc('month',date), 'YYYY-MM-DD'),count(distinct post_id) 
from some_table where category_id=1 and entity_id = 77  and entity2_id = 115 
and date <= '2008-12-06' and date >= '2007-12-01' group by 
date_trunc('month',date) order by date_trunc('month',date);
  to_char   | count 
------------+-------
 2007-12-01 |    64
 2008-01-01 |    31
 2008-02-01 |    14
 2008-03-01 |    21
 2008-04-01 |    28
 2008-05-01 |    44
 2008-06-01 |   100
 2008-07-01 |    72
 2008-08-01 |    91
 2008-09-01 |    92
 2008-10-01 |    79
 2008-11-01 |    65
(12 rows)

But others skip the intervals for which no data is present, like this one:

select to_char(date_trunc('month',date), 'YYYY-MM-DD'),count(distinct post_id) 
from some_table where category_id=1 and entity_id = 75  and entity2_id = 115 
and date <= '2008-12-06' and date >= '2007-12-01' group by 
date_trunc('month',date) order by date_trunc('month',date);

  to_char   | count 
------------+-------
 2007-12-01 |     2
 2008-01-01 |     2
 2008-03-01 |     1
 2008-04-01 |     2
 2008-06-01 |     1
 2008-08-01 |     3
 2008-10-01 |     2
(7 rows)

whereas the required result set is:

  to_char   | count 
------------+-------
 2007-12-01 |     2
 2008-01-01 |     2
 2008-02-01 |     0
 2008-03-01 |     1
 2008-04-01 |     2
 2008-05-01 |     0
 2008-06-01 |     1
 2008-07-01 |     0
 2008-08-01 |     3
 2008-09-01 |     0
 2008-10-01 |     2
 2008-11-01 |     0
(12 rows)

That is, intervals with no entries should appear with a count of 0.

I have seen earlier discussions on Stack Overflow, but they don't seem to solve my problem, since my grouping period is one of (day, week, month, quarter, year) and is decided at runtime by the application. So I guess an approach like a left join with a calendar table or sequence table will not help.

My current solution is to fill in these gaps in Python (in a TurboGears app) using the calendar module.

Is there a better way to do this?

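Solution

You can create a list of the first day of every month in the last year (say) by stepping back through the year in 28-day increments, so that no month is skipped, and truncating each date to its month: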

select distinct date_trunc('month', (current_date - offs)) as date 
from generate_series(0,365,28) as offs;
          date
------------------------
 2007-12-01 00:00:00+01
 2008-01-01 00:00:00+01
 2008-02-01 00:00:00+01
 2008-03-01 00:00:00+01
 2008-04-01 00:00:00+02
 2008-05-01 00:00:00+02
 2008-06-01 00:00:00+02
 2008-07-01 00:00:00+02
 2008-08-01 00:00:00+02
 2008-09-01 00:00:00+02
 2008-10-01 00:00:00+02
 2008-11-01 00:00:00+01
 2008-12-01 00:00:00+01

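Then you can left join your data with that series, so that periods with no matching rows still appear in the result, with a count of 0.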
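The join itself is not spelled out in the answer. A minimal sketch of what it might look like follows, using the table, columns, and date range from the question. The subquery alias `months`, the column alias `month`, and the fixed count of 12 periods are assumptions made for illustration. Instead of the 28-day step plus distinct, it builds the series by adding whole months to the truncated start date, which gives the same kind of list.

```sql
-- Sketch only: some_table, post_id, category_id, entity_id, entity2_id and date
-- come from the question; the aliases "months", "month" and "t" are hypothetical.
select to_char(months.month, 'YYYY-MM-DD') as to_char,
       count(distinct t.post_id) as count
from (
    -- One row per month, from 2007-12 through 2008-11 (12 periods).
    select date_trunc('month', date '2007-12-01') + offs * interval '1 month' as month
    from generate_series(0, 11) as offs
) as months
left join some_table t
       -- All filters on some_table go in the ON clause. Putting them in WHERE
       -- would discard the unmatched (all-NULL) rows and bring the gaps back.
       on date_trunc('month', t.date) = months.month
      and t.category_id = 1
      and t.entity_id = 75
      and t.entity2_id = 115
      and t.date >= '2007-12-01'
      and t.date <= '2008-12-06'
group by months.month
order by months.month;
```

Because count(distinct t.post_id) ignores NULLs, months with no matching rows come out as 0. Since the grouping period is chosen at runtime, the application would presumably substitute the unit in both date_trunc calls and the matching step in the interval (for example 'week' with interval '1 week', or 'quarter' with interval '3 months'), along with the number of periods. On PostgreSQL 8.4 and later, generate_series also accepts timestamps with an interval step, which may simplify the subquery.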
