Adding missing dates in a table in PostgreSQL
I have a table that contains data for every day in 2002, but some dates are missing: there are 354 records for 2002 instead of 365. For my calculations, I need the missing dates to be present in the table with NULL values.
+-----+------------+------------+
| ID | rainfall | date |
+-----+------------+------------+
| 100 | 110.2 | 2002-05-06 |
| 101 | 56.6 | 2002-05-07 |
| 102 | 65.6 | 2002-05-09 |
| 103 | 75.9 | 2002-05-10 |
+-----+------------+------------+
As you can see, 2002-05-08 is missing. I want my final table to look like this:
+-----+------------+------------+
| ID | rainfall | date |
+-----+------------+------------+
| 100 | 110.2 | 2002-05-06 |
| 101 | 56.6 | 2002-05-07 |
| 102 | | 2002-05-08 |
| 103 | 65.6 | 2002-05-09 |
| 104 | 75.9 | 2002-05-10 |
+-----+------------+------------+
Is there a way to do that in PostgreSQL?
It doesn't matter if I have the result just as a query result (not necessarily an updated table)
date is a reserved word in standard SQL and the name of a data type in PostgreSQL. PostgreSQL allows it as an identifier, but that doesn't make it a good idea. I use thedate as the column name instead.
Don't rely on the absence of gaps in a surrogate ID. That's almost always a bad idea. Treat such an ID as a unique number without meaning, even if it seems to carry certain other attributes most of the time.
In this particular case, as @Clodoaldo commented, thedate seems to be a perfect primary key and the column id is just cruft, so I removed it:
CREATE TEMP TABLE tbl (thedate date PRIMARY KEY, rainfall numeric);
INSERT INTO tbl(thedate, rainfall) VALUES
('2002-05-06', 110.2)
, ('2002-05-07', 56.6)
, ('2002-05-09', 65.6)
, ('2002-05-10', 75.9);
Query
Full table by query:
SELECT x.thedate, t.rainfall -- rainfall automatically NULL for missing rows
FROM (
SELECT generate_series(min(thedate), max(thedate), '1d')::date AS thedate
FROM tbl
) x
LEFT JOIN tbl t USING (thedate)
ORDER BY x.thedate;
Similar to what @a_horse_with_no_name posted, but simplified and ignoring the pruned id.
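If a consecutive number like the ID column in the question is still wanted in the output, it can be computed at query time instead of being stored. The following is only a sketch: the offset 99 is an assumption chosen so that numbering starts at 100 as in the question's example, and the alias id is illustrative.

```sql
-- Number the gap-filled rows at query time with a window function.
-- The offset 99 is a hypothetical choice so that numbering starts at 100.
SELECT 99 + row_number() OVER (ORDER BY x.thedate) AS id
     , t.rainfall
     , x.thedate
FROM  (
   SELECT generate_series(min(thedate), max(thedate), '1d')::date AS thedate
   FROM   tbl
   ) x
LEFT   JOIN tbl t USING (thedate)
ORDER  BY x.thedate;
```

The numbers are derived from the date order on every run, so they stay gap-free without relying on a stored surrogate ID.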
This fills in gaps between the first and last date found in the table. If there can be leading or trailing gaps, adapt accordingly. You can use date_trunc() like @Clodoaldo demonstrated, but his query suffers from several syntax errors and is needlessly convoluted.
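One way to cover leading and trailing gaps is to generate the series from fixed bounds rather than from min() and max(). This is a minimal sketch, assuming the whole calendar year 2002 is wanted, as the question suggests; the bounds are hard-coded for illustration and the alias d is not from the original answer.

```sql
-- Generate every day of 2002, independent of which dates exist in tbl.
-- The year bounds are an assumption taken from the question.
SELECT d::date AS thedate
     , t.rainfall                -- NULL for days without a row in tbl
FROM   generate_series(timestamp '2002-01-01'
                     , timestamp '2002-12-31'
                     , interval  '1 day') AS d
LEFT   JOIN tbl t ON t.thedate = d::date
ORDER  BY d;
```

Since 2002 is not a leap year, this returns 365 rows, whether or not the first and last days of the year are present in tbl.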
INSERT missing rows
The fastest and most readable way to do it is a NOT EXISTS anti-semi-join.
INSERT INTO tbl (thedate, rainfall)
SELECT x.thedate, NULL
FROM (
SELECT generate_series(min(thedate), max(thedate), '1d')::date AS thedate
FROM tbl
) x
WHERE NOT EXISTS (SELECT 1 FROM tbl t WHERE t.thedate = x.thedate);
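To check the result of the INSERT, a quick count can help. This is a sketch only; the column aliases are illustrative, and it assumes that every row in tbl with a NULL rainfall is a filled-in gap.

```sql
-- count(*) counts all rows; count(rainfall) skips rows where rainfall is NULL.
SELECT count(*)                   AS total_rows
     , count(rainfall)            AS rows_with_data
     , count(*) - count(rainfall) AS filled_gaps
FROM   tbl;
```

With the four sample rows above, running this after the INSERT should report 5 total rows, 4 rows with data, and 1 filled gap (2002-05-08).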