Duplicate entries with different timestamps
Question
I have a customer table in SQL named Customer_SCD. It has 3 columns: Customer_Name, Customer_ID, and Customer_TimeStamp.
The table contains duplicate entries that differ only in their timestamp.
For example:
ABC, 1, 2012-12-05 11:58:20.370
ABC, 1, 2012-12-03 12:11:09.840
I want to remove these duplicates from the database and keep only the entry with the earliest date/time.
Thanks.
Recommended answer
This works, give it a try:
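-- Delete every row whose timestamp is not the earliest one for its customer.
DELETE b
OUTPUT deleted.*                      -- return the rows that were removed
FROM Customer_SCD AS b
JOIN (
    -- Earliest timestamp per (Customer_ID, Customer_Name) pair
    SELECT MIN(a.Customer_TimeStamp) AS Customer_TimeStamp,
           a.Customer_ID,
           a.Customer_Name
    FROM Customer_SCD AS a
    GROUP BY a.Customer_ID, a.Customer_Name
) AS c
    ON  c.Customer_ID = b.Customer_ID
    AND c.Customer_Name = b.Customer_Name
    AND c.Customer_TimeStamp <> b.Customer_TimeStamp;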
The subquery finds the earliest record for every Customer_Name, Customer_ID pair, and the DELETE then removes all the other records for that pair. I also added the OUTPUT clause, which returns the rows affected by the statement.
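Before running a destructive statement like this, it can help to look at the rows it would remove. The following is a minimal sketch that reuses the same join condition in a plain SELECT; it assumes the table and column names given in the question and changes no data.

```sql
-- Preview only: lists the rows the DELETE would remove, without modifying the table.
SELECT b.Customer_Name,
       b.Customer_ID,
       b.Customer_TimeStamp
FROM Customer_SCD AS b
JOIN (
    -- Earliest timestamp per (Customer_ID, Customer_Name) pair
    SELECT Customer_ID,
           Customer_Name,
           MIN(Customer_TimeStamp) AS Customer_TimeStamp
    FROM Customer_SCD
    GROUP BY Customer_ID, Customer_Name
) AS c
    ON  c.Customer_ID = b.Customer_ID
    AND c.Customer_Name = b.Customer_Name
    AND c.Customer_TimeStamp <> b.Customer_TimeStamp
ORDER BY b.Customer_ID, b.Customer_Name, b.Customer_TimeStamp;
```

For the two sample rows in the question, this should list only the 2012-12-05 row, since the 2012-12-03 row carries the earliest timestamp and is the one being kept.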
You could also do it with the ranking function ROW_NUMBER:
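-- Number the rows of each customer by timestamp and delete everything except row 1.
DELETE b
OUTPUT deleted.*                      -- return the rows that were removed
FROM Customer_SCD AS b
JOIN (
    SELECT Customer_ID,
           Customer_Name,
           Customer_TimeStamp,
           -- num = 1 marks the earliest row in each (Customer_ID, Customer_Name) group
           ROW_NUMBER() OVER (PARTITION BY Customer_ID, Customer_Name
                              ORDER BY Customer_TimeStamp) AS num
    FROM Customer_SCD
) AS c
    ON  c.Customer_ID = b.Customer_ID
    AND c.Customer_Name = b.Customer_Name
    AND c.Customer_TimeStamp = b.Customer_TimeStamp
    AND c.num <> 1;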
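The same ROW_NUMBER idea can also be written without the self-join, by deleting through a common table expression. This is a sketch of a commonly used T-SQL pattern rather than part of the original answer; the CTE name `ranked` is made up for illustration. One difference worth noting: if two rows of the same customer share exactly the same timestamp, the join-based version matches both of them on Customer_TimeStamp and removes both, whereas this form keeps exactly one row per customer.

```sql
-- Assumes the previous statement in the batch is terminated with a semicolon,
-- which T-SQL requires before a WITH clause.
WITH ranked AS (
    SELECT Customer_ID,
           Customer_Name,
           Customer_TimeStamp,
           ROW_NUMBER() OVER (PARTITION BY Customer_ID, Customer_Name
                              ORDER BY Customer_TimeStamp) AS num
    FROM Customer_SCD
)
-- Deleting from the CTE removes the underlying Customer_SCD rows.
DELETE FROM ranked
WHERE num > 1;
```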
See which one has the smaller query cost and use it. When I checked, the first approach was more efficient (it had a better execution plan).
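One way to compare the two statements on your own data is to run each inside a transaction that is rolled back, with I/O and timing statistics switched on. The sketch below assumes SQL Server and the table from the question; the numbers it reports depend on your data, indexes, and cache state, so treat them as a rough guide and compare the actual execution plans as well.

```sql
SET STATISTICS IO ON;    -- report logical/physical reads per statement
SET STATISTICS TIME ON;  -- report CPU and elapsed time per statement

-- Candidate 1: MIN / GROUP BY
BEGIN TRANSACTION;
    DELETE b
    FROM Customer_SCD AS b
    JOIN (
        SELECT Customer_ID, Customer_Name,
               MIN(Customer_TimeStamp) AS Customer_TimeStamp
        FROM Customer_SCD
        GROUP BY Customer_ID, Customer_Name
    ) AS c
        ON  c.Customer_ID = b.Customer_ID
        AND c.Customer_Name = b.Customer_Name
        AND c.Customer_TimeStamp <> b.Customer_TimeStamp;
ROLLBACK TRANSACTION;    -- undo the delete so the next test sees the same data

-- Candidate 2: ROW_NUMBER
BEGIN TRANSACTION;
    DELETE b
    FROM Customer_SCD AS b
    JOIN (
        SELECT Customer_ID, Customer_Name, Customer_TimeStamp,
               ROW_NUMBER() OVER (PARTITION BY Customer_ID, Customer_Name
                                  ORDER BY Customer_TimeStamp) AS num
        FROM Customer_SCD
    ) AS c
        ON  c.Customer_ID = b.Customer_ID
        AND c.Customer_Name = b.Customer_Name
        AND c.Customer_TimeStamp = b.Customer_TimeStamp
        AND c.num <> 1;
ROLLBACK TRANSACTION;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The statistics appear in the Messages output of the client. Because both transactions are rolled back, the table is left unchanged after the comparison.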
Here's an SQL Fiddle