What's the best way to dedupe a table?

Question

I've seen a couple of solutions for this, but I'm wondering what the best and most efficient way is to de-dupe a table. You can use code (SQL, etc.) to illustrate your point, but I'm just looking for basic algorithms. I assumed there would already be a question about this on SO, but I wasn't able to find one, so if it already exists just give me a heads up.

(Just to clarify - I'm referring to getting rid of duplicates in a table that has an incremental automatic PK and has some rows that are duplicates in everything but the PK field.)

Recommended answer

Use the analytic function ROW_NUMBER() to number the rows within each group of duplicates, then delete every row numbered above 1:

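-- Number the rows within each group of identical (col1, col2) values.
-- ORDER BY col1 ties inside a partition, so which row gets number 1 is arbitrary.
WITH CTE (col1, col2, dupcnt)
AS
(
    SELECT col1, col2,
           ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col1) AS dupcnt
    FROM Youtable
)
-- SQL Server allows DELETE through the CTE; the rows are removed from Youtable.
DELETE
FROM CTE
WHERE dupcnt > 1
GO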
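
Since the question asks for the basic algorithm rather than one dialect's syntax, here is a minimal, runnable sketch of the same idea: keep one row per group of identical non-key columns and delete the rest. It uses Python's built-in sqlite3 module and a portable `MIN(id)` / `GROUP BY` formulation instead of `ROW_NUMBER()`, so it is a variant of the answer and not the answer's own statement. The table name `items`, the key column `id`, and the helper `dedupe` are hypothetical names chosen for illustration.

```python
import sqlite3


def dedupe(conn: sqlite3.Connection) -> int:
    """Delete duplicate rows, keeping the lowest id per (col1, col2) group.

    Returns the number of rows deleted. Table and column names are
    illustrative assumptions, not taken from the original answer.
    """
    cur = conn.execute(
        """
        DELETE FROM items
        WHERE id NOT IN (
            -- one surviving id per group of identical non-key values
            SELECT MIN(id) FROM items GROUP BY col1, col2
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Build a small in-memory table with an auto-incrementing primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, col1 TEXT, col2 TEXT)"
)
conn.executemany(
    "INSERT INTO items (col1, col2) VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("b", "y"), ("a", "x"), ("b", "z")],
)

deleted = dedupe(conn)
remaining = conn.execute(
    "SELECT id, col1, col2 FROM items ORDER BY id"
).fetchall()
print(deleted, remaining)
```

Unlike the `ORDER BY col1` in the statement above, keeping `MIN(id)` makes the surviving row deterministic: the earliest inserted copy of each group stays.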