SQL-基于2个值删除重复项 [英] SQL - Remove Duplicates based on 2 values
本文介绍了SQL-基于2个值删除重复项的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我有2个具有相同表结构的Postgresql数据库.出于报告目的,我将这些表的数据推送到单个Google BigQuery表中.
I have 2 Postgresql databases with the same table structure. For reporting purposes, I'm pushing these table's data into a single Google BigQuery table.
在报表上,我有一个名为databaseID的列,它指示源数据库是什么.
On the reporting table, I have a column called databaseID which indicates what is the source database.
databaseID - 1(1st postgres table)
databaseID - 2 (2nd postgres table)
现在,每次将增量数据附加到报表中时.因此,这两个数据源都有重复项.
Now everytime I'm appending the incremental data to the reporting table. So it has duplicates for both data sources.
id name DatabaseID updated_date
1 aaa 1 2020-12-01
2 ccc 1 2020-12-01
1 vvv 1 2021-01-05
1 qqq 2 2020-12-01
2 www 2 2020-12-01
2 aaa 2 2021-01-05
3 xxx 2 2020-12-01
我必须为两个数据库ID重复删除此数据.我不确定用于此的SQL逻辑.
I have to de-duplicate this data for both the database IDs. I'm not sure about the SQL logic for this.
id name DatabaseID updated_date
2 ccc 1 2020-12-01
1 vvv 1 2021-01-05
1 qqq 2 2020-12-01
2 aaa 2 2021-01-05
3 xxx 2 2020-12-01
推荐答案
能否请您尝试以下操作:
Could you please try something like this:
WITH CTE(ID,NAME,DATABASEID,UPDATED_DATE) AS
(
SELECT 1,'AAA',1,'2020-12-01'
UNION ALL
SELECT 2,'CCC',1,'2020-12-01'
UNION ALL
SELECT 1,'VVV',1,'2021-01-05'
UNION ALL
SELECT 1,'QQQ',2,'2020-12-01'
UNION ALL
SELECT 2,'WWW',2,'2020-12-01'
UNION ALL
SELECT 2,'AAA',2,'2021-01-05'
UNION ALL
SELECT 3,'XXX',2,'2020-12-01'
)
SELECT X.ID,X.NAME,X.DATABASEID,X.UPDATED_DATE FROM
(
SELECT C.ID,C.NAME,C.DATABASEID,C.UPDATED_DATE,
ROW_NUMBER()OVER(PARTITION BY C.ID,C.DATABASEID ORDER BY C.UPDATED_DATE DESC)XCOL
FROM CTE AS C
)X WHERE X.XCOL=1;
这篇关于SQL-基于2个值删除重复项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文