Adding missing dates to a table in PostgreSQL


Problem description

I have a table that should contain data for every day in 2002, but some dates are missing: there are 354 records for 2002 instead of 365. For my calculations, I need the missing dates to be present in the table with NULL values.

+-----+------------+------------+
| ID  |  rainfall  | date       |
+-----+------------+------------+
| 100 |  110.2     | 2002-05-06 |
| 101 |  56.6      | 2002-05-07 |
| 102 |  65.6      | 2002-05-09 |
| 103 |  75.9      | 2002-05-10 |
+-----+------------+------------+

As you can see, 2002-05-08 is missing. I want my final table to look like this:

+-----+------------+------------+
| ID  |  rainfall  | date       |
+-----+------------+------------+
| 100 |  110.2     | 2002-05-06 |
| 101 |  56.6      | 2002-05-07 |
| 102 |            | 2002-05-08 |
| 103 |  65.6      | 2002-05-09 |
| 104 |  75.9      | 2002-05-10 |
+-----+------------+------------+

Is there a way to do that in PostgreSQL?

It is fine if I only get this as a query result (not necessarily an updated table).

Solution

date is a reserved word in standard SQL and the name of a data type in PostgreSQL. PostgreSQL allows it as an identifier, but that doesn't make it a good idea. I use thedate as the column name instead.

Don't rely on the absence of gaps in a surrogate ID. That's almost always a bad idea. Treat such an ID as a unique number without meaning, even if it seems to carry certain other attributes most of the time.

In this particular case, as @Clodoaldo commented, thedate seems to be a perfect primary key and the column id is just cruft - which I removed:

CREATE TEMP TABLE tbl (thedate date PRIMARY KEY, rainfall numeric);
INSERT INTO tbl(thedate, rainfall) VALUES
  ('2002-05-06', 110.2)
, ('2002-05-07', 56.6)
, ('2002-05-09', 65.6)
, ('2002-05-10', 75.9);

Query

Full table by query:

SELECT x.thedate, t.rainfall  -- rainfall automatically NULL for missing rows
FROM (
   SELECT generate_series(min(thedate), max(thedate), '1d')::date AS thedate
   FROM   tbl
   ) x
LEFT   JOIN tbl t USING (thedate)
ORDER  BY x.thedate

Similar to what @a_horse_with_no_name posted, but simplified and ignoring the pruned id.
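
If a running number like the id in the question's desired output is still wanted, it can be derived at query time instead of being stored. A minimal sketch, assuming the tbl defined above; the alias id and the offset 99 (so that numbering starts at 100, as in the question) are illustrative choices only:

```sql
SELECT 99 + row_number() OVER (ORDER BY x.thedate) AS id  -- derived per query, not stored
     , t.rainfall                                         -- NULL for dates missing in tbl
     , x.thedate
FROM  (
   SELECT generate_series(min(thedate), max(thedate), '1d')::date AS thedate
   FROM   tbl
   ) x
LEFT   JOIN tbl t USING (thedate)
ORDER  BY x.thedate;
```

The number is recomputed on every run, so it stays gapless no matter which rows are added or removed later.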

This fills in gaps between the first and last date found in the table. If there can be leading or trailing gaps, adapt accordingly. You can use date_trunc() like @Clodoaldo demonstrated, but his query suffers from several syntax errors and is needlessly convoluted.
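
To cover leading and trailing gaps as well, the series can span the whole calendar year instead of running from min to max. A sketch of one way to do this with date_trunc(), assuming all rows in tbl fall into the same year (as in the question); the alias year_start is made up for illustration:

```sql
SELECT x.thedate, t.rainfall
FROM  (
   SELECT generate_series(y.year_start                                         -- Jan 1
                        , y.year_start + interval '1 year' - interval '1 day'  -- Dec 31
                        , interval '1 day')::date AS thedate
   FROM  (
      -- cast to timestamp so the series does not depend on the session time zone
      SELECT date_trunc('year', min(thedate)::timestamp) AS year_start
      FROM   tbl
      ) y
   ) x
LEFT   JOIN tbl t USING (thedate)
ORDER  BY x.thedate;
```

For data from 2002 this returns one row per day of the year, with rainfall NULL wherever tbl has no row.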

INSERT missing rows

The fastest and most readable way to do it is a NOT EXISTS anti-semi-join.

INSERT INTO tbl (thedate, rainfall)
SELECT x.thedate, NULL
FROM (
   SELECT generate_series(min(thedate), max(thedate), '1d')::date AS thedate
   FROM   tbl
   ) x
WHERE NOT EXISTS (SELECT 1 FROM tbl t WHERE t.thedate = x.thedate)
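
A quick way to confirm that no gaps remain after the INSERT is to compare the row count with the number of days in the covered range. A sketch that relies on thedate being the primary key (so there are no duplicate dates); the column aliases are illustrative:

```sql
SELECT count(*)                        AS rows_present
     , max(thedate) - min(thedate) + 1 AS days_in_range  -- date - date yields integer days
FROM   tbl;
```

Both columns should show the same number once every date between the first and last row is present.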
