Join a count query on generate_series() and retrieve Null values as '0'


Question

I want to count IDs per month using generate_series(). This query works in PostgreSQL 9.1:

SELECT (to_char(serie,'yyyy-mm')) AS year, sum(amount)::int AS eintraege FROM (
    SELECT  
       COUNT(mytable.id) as amount,   
       generate_series::date as serie   
       FROM mytable  
    
    RIGHT JOIN generate_series(     
       (SELECT min(date_from) FROM mytable)::date,   
       (SELECT max(date_from) FROM mytable)::date,  
       interval '1 day') ON generate_series = date(date_from)  
       WHERE version = 1   
       GROUP BY generate_series       
       ) AS foo
GROUP BY Year   
ORDER BY Year ASC;  

This is my output:

"2006-12" | 4  
"2007-02" | 1  
"2007-03" | 1  

But what I want to get is this output (with a '0' value for January):

"2006-12" | 4  
"2007-01" | 0  
"2007-02" | 1  
"2007-03" | 1  

Months without any id should be listed nevertheless.
Any ideas how to solve this?

Sample data:

drop table if exists mytable;
create table mytable(id bigint, version smallint, date_from timestamp);
insert into mytable(id, version, date_from) values
(4084036, 1, '2006-12-22 22:46:35'),
(4084938, 1, '2006-12-23 16:19:13'),
(4084938, 2, '2006-12-23 16:20:23'),
(4084939, 1, '2006-12-23 16:29:14'),
(4084954, 1, '2006-12-23 16:28:28'),
(4250653, 1, '2007-02-12 21:58:53'),
(4250657, 1, '2007-03-12 21:58:53')
;

Recommended answer

Untangled, simplified and fixed, it might look like this:

SELECT to_char(s.tag,'yyyy-mm') AS monat
     , count(t.id) AS eintraege
FROM  (
   SELECT generate_series(min(date_from)::date
                        , max(date_from)::date
                        , interval '1 day'
          )::date AS tag
   FROM   mytable t
   ) s
LEFT   JOIN mytable t ON t.date_from::date = s.tag AND t.version = 1   
GROUP  BY 1
ORDER  BY 1;

db<>fiddle here

Among all the noise, misleading identifiers and unconventional formatting, the actual problem was hidden here:

WHERE version = 1

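You made correct use of RIGHT [OUTER] JOIN. But adding a WHERE clause that requires an existing row from mytable effectively converts the RIGHT [OUTER] JOIN into an [INNER] JOIN. For days without a matching row, version is NULL, so version = 1 is not true and those days are dropped.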

Move that filter into the JOIN condition to make it work.
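To see the difference in isolation, here is a minimal sketch with made-up inline data. The names days and t and all values are hypothetical and not taken from the question; only the position of the version filter differs between the two statements.

```sql
-- Filter in WHERE: applied after the outer join, so days without a
-- matching row (t.version IS NULL) are removed again.
WITH days(tag) AS (
   VALUES (date '2007-01-01'), (date '2007-01-02'), (date '2007-01-03')
   )
, t(id, version, date_from) AS (
   VALUES (1, 1, date '2007-01-01')
        , (2, 2, date '2007-01-03')
   )
SELECT d.tag, count(t.id) AS ct
FROM   days d
LEFT   JOIN t ON t.date_from = d.tag
WHERE  t.version = 1
GROUP  BY d.tag
ORDER  BY d.tag;
-- returns a single row: 2007-01-01 | 1

-- Filter in the join condition: every day survives, unmatched days count 0.
WITH days(tag) AS (
   VALUES (date '2007-01-01'), (date '2007-01-02'), (date '2007-01-03')
   )
, t(id, version, date_from) AS (
   VALUES (1, 1, date '2007-01-01')
        , (2, 2, date '2007-01-03')
   )
SELECT d.tag, count(t.id) AS ct
FROM   days d
LEFT   JOIN t ON t.date_from = d.tag AND t.version = 1
GROUP  BY d.tag
ORDER  BY d.tag;
-- returns three rows: 2007-01-01 | 1, 2007-01-02 | 0, 2007-01-03 | 0
```

The same reasoning applies to the RIGHT JOIN in the question, with the roles of the two tables swapped.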

I simplified a few other things while at it.

SELECT to_char(mon, 'yyyy-mm') AS monat
     , COALESCE(t.ct, 0) AS eintraege
FROM  (
   SELECT date_trunc('month', date_from)::date AS mon
        , count(*) AS ct
   FROM   mytable
   WHERE  version = 1     
   GROUP  BY 1
   ) t
RIGHT JOIN (
   SELECT generate_series(date_trunc('month', min(date_from))
                        , max(date_from)
                        , interval '1 mon')::date
   FROM   mytable
   ) m(mon) USING (mon)
ORDER  BY mon;

db<>fiddle here

It's much cheaper to aggregate first and join later: that joins one row per month instead of one row per day.

It's cheaper to base GROUP BY and ORDER BY on the date value instead of the rendered text.

count(*) is a bit faster than count(id), while equivalent in this query.
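The equivalence holds here because the count runs inside the subquery, before the outer join. A small sketch with hypothetical inline data (months and t are made-up names) shows why the two differ once the count is taken after an outer join, as in the first query above:

```sql
WITH months(mon) AS (
   VALUES (date '2007-01-01'), (date '2007-02-01')
   )
, t(id, mon) AS (
   VALUES (1, date '2007-02-01')
   )
SELECT m.mon
     , count(*)    AS count_star  -- counts every joined row, including the NULL-extended one
     , count(t.id) AS count_id    -- counts only rows where t.id IS NOT NULL
FROM   months m
LEFT   JOIN t ON t.mon = m.mon
GROUP  BY m.mon
ORDER  BY m.mon;
-- 2007-01-01 | 1 | 0
-- 2007-02-01 | 1 | 1
```

So count(*) is only a safe replacement when no NULL-extended rows can reach the aggregate, which is the case when aggregating before the join.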

generate_series() is a bit faster and safer when based on timestamp instead of date.
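A short sketch of what that means in practice, using literal bounds chosen for illustration. The explanation is an assumption based on PostgreSQL's documented function variants: there is no generate_series() variant that takes date, so date input has to be cast first, and as far as I know it resolves to timestamp with time zone, which makes the result depend on the session's TimeZone setting.

```sql
-- timestamp bounds: matches generate_series(timestamp, timestamp, interval)
-- directly, no implicit cast and no time zone involved.
SELECT generate_series(timestamp '2006-12-01'
                     , timestamp '2007-03-01'
                     , interval  '1 month')::date AS mon;
-- returns 2006-12-01, 2007-01-01, 2007-02-01, 2007-03-01

-- date bounds: inspect which type the series actually has after the implicit cast.
SELECT pg_typeof(g) AS series_type
FROM   generate_series(date '2006-12-01'
                     , date '2007-03-01'
                     , interval '1 month') g
LIMIT  1;
```

The second query above already follows this advice: date_from is a timestamp column, so date_trunc('month', min(date_from)) and max(date_from) are passed to generate_series() as timestamp values.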
