Grouping by date, with 0 when count() yields no lines
Problem description
I'm using PostgreSQL 9 and I'm struggling with counting and grouping when no lines are counted.
Let's assume the following schema:
create table views (
    date_event timestamp with time zone,
    event_id integer
);
Let's imagine the following content:
2012-01-01 00:00:05 2
2012-01-01 01:00:05 5
2012-01-01 03:00:05 8
2012-01-01 03:00:15 20
I want to group by hour and count the number of lines. I would like to retrieve the following:
2012-01-01 00:00:00 1
2012-01-01 01:00:00 1
2012-01-01 02:00:00 0
2012-01-01 03:00:00 2
2012-01-01 04:00:00 0
2012-01-01 05:00:00 0
.
.
2012-01-07 23:00:00 0
I mean that for each time slot, I count the number of lines in my table whose date falls in that slot; if there are none, I still return a line with a count of zero.
The following will definitely not work (it yields only the slots whose count is > 0).
SELECT extract ( hour from date_event ),count(*)
FROM views
where date_event > '2012-01-01' and date_event <'2012-01-07'
GROUP BY extract ( hour from date_event );
Please note I might also need to group by minute, or by hour, or by day, or by month, or by year (multiple queries are possible, of course).
I can only use plain old SQL, and since my views table can be very big (>100M records), I try to keep performance in mind.
How can this be achieved?
Thanks!
Recommended answer
Given that you don't have the dates in the table, you need a way to generate them. You can use the generate_series function:
SELECT * FROM generate_series('2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') AS ts;
This will produce a result like this:
ts
---------------------
2012-01-01 00:00:00
2012-01-01 01:00:00
2012-01-01 02:00:00
2012-01-01 03:00:00
...
2012-01-07 21:00:00
2012-01-07 22:00:00
2012-01-07 23:00:00
(168 rows)
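Since the question also asks about other granularities, the same function can be sketched with a different step. These are illustrative calls only; the date ranges are arbitrary examples, not values from the question:

```sql
-- One row per day for the first week of January 2012 (7 rows).
SELECT * FROM generate_series('2012-01-01'::timestamp, '2012-01-07', '1 day') AS ts;

-- One row per month for the year 2012 (12 rows).
SELECT * FROM generate_series('2012-01-01'::timestamp, '2012-12-01', '1 month') AS ts;

-- One row per minute for the first hour of 2012-01-01 (60 rows).
SELECT * FROM generate_series('2012-01-01 00:00'::timestamp, '2012-01-01 00:59', '1 minute') AS ts;
```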
The remaining task is to join the counting query with the generated series using an outer join, like this:
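-- Count events per (day, hour), then right-join onto the generated hourly
-- series so that slots without events still appear, with a count of 0.
-- Table and column names follow the schema from the question.
select extract ( day from ts ) as day, extract ( hour from ts ) as hour, coalesce(count,0) as count from
(
    SELECT extract ( day from date_event ) as day, extract ( hour from date_event ) as hr, count(*)
    FROM views
    -- this filter stops at 2012-01-07 00:00, so every slot on 2012-01-07 shows 0
    where date_event > '2012-01-01' and date_event < '2012-01-07'
    GROUP BY extract ( day from date_event ), extract ( hour from date_event )
) AS cnt
right outer join ( SELECT * FROM generate_series ( '2012-01-01'::timestamp, '2012-01-07 23:00', '1 hour') AS ts ) as dtetable
    on extract ( hour from ts ) = cnt.hr and extract ( day from ts ) = cnt.day
order by day, hour asc;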
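Matching on the extracted day and hour only works while the range stays within a single month. A possible variant, given here as an untested sketch rather than as part of the original answer, joins each generated slot to the rows that fall inside it. The aliases slots, v and slot_start are made up for illustration, and the performance remark assumes an index on views.date_event, which the question does not mention.

```sql
-- Sketch: one output row per hourly slot, 0 where no event falls in the slot.
SELECT slots.ts            AS slot_start,
       count(v.date_event) AS count   -- counts only matched rows, so empty slots give 0
FROM generate_series('2012-01-01'::timestamptz,
                     '2012-01-07 23:00'::timestamptz,
                     '1 hour') AS slots(ts)
LEFT JOIN views AS v
       ON v.date_event >= slots.ts
      AND v.date_event <  slots.ts + interval '1 hour'
GROUP BY slots.ts
ORDER BY slots.ts;
```

To group by minute, day, month or year instead, the step passed to generate_series and the interval added in the join condition would both change to the same value (for example '1 day'). Because the join condition is a plain range on date_event, the planner may be able to use an index on that column, but this should be checked with EXPLAIN on the real table.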