SQL query to find duplicate rows, in any table


Question

I'm looking for a schema-independent query. That is, whether I have a users table or a purchases table, the same query should catch duplicate rows in either one without any modification (other than the FROM clause, of course).

I'm using T-SQL, but I'm guessing there should be a general solution.

Recommended answer

I believe that this should work for you. Keep in mind that CHECKSUM() isn't 100% perfect: it's theoretically possible (I think) for two different rows to produce the same checksum and show up here as a false positive. Otherwise you can just change the table name and this should work:

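-- Number every row and compute a checksum over all of its columns.
;WITH cte AS (
    SELECT
        *,
        CHECKSUM(*) AS chksum,
        -- ROW_NUMBER() needs an ORDER BY; GETDATE() stands in for a real sort key.
        ROW_NUMBER() OVER(ORDER BY GETDATE()) AS row_num
    FROM
        My_Table
)
-- Pair up rows that share a checksum but are different rows.
-- Each matching pair appears twice, once in each order.
SELECT
    *
FROM
    cte T1
INNER JOIN cte T2 ON
    T2.chksum = T1.chksum AND
    T2.row_num <> T1.row_num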

The ROW_NUMBER() is needed so that you have some way of distinguishing rows. It requires an ORDER BY and that can't be a constant, so GETDATE() was my workaround for that.
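As a side note, another commonly used way to satisfy the ORDER BY requirement is to order by a subquery that returns a constant. This is only a sketch of that variation, not part of the original answer; My_Table is the same placeholder table name used in the query above, and the resulting row order is just as arbitrary as with GETDATE().

```sql
-- Sketch: the same numbering step, ordered by a constant subquery
-- instead of GETDATE(). My_Table is a placeholder table name.
SELECT
    *,
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS row_num
FROM
    My_Table;
```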

Simply change the table name in the CTE and it should work without spelling out the columns.
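To try the approach out, here is a minimal sketch against a throwaway temp table. The table #Demo_Users and its columns are hypothetical and exist only for illustration. The join uses > instead of <> so that each pair is listed once, which is a small variation on the query above. The outer select lists the demo columns only to keep the output readable; the CTE itself does not name any columns.

```sql
-- Hypothetical demo table with no key, so exact duplicates are allowed.
CREATE TABLE #Demo_Users (
    user_id   INT,
    user_name VARCHAR(50)
);

-- The first and third rows are identical.
INSERT INTO #Demo_Users (user_id, user_name)
VALUES (1, 'alice'),
       (2, 'bob'),
       (1, 'alice');

;WITH cte AS (
    SELECT
        *,
        CHECKSUM(*) AS chksum,
        ROW_NUMBER() OVER (ORDER BY GETDATE()) AS row_num
    FROM
        #Demo_Users
)
SELECT
    T1.row_num AS first_row_num,
    T2.row_num AS second_row_num,
    T1.user_id,
    T1.user_name
FROM
    cte T1
INNER JOIN cte T2 ON
    T2.chksum = T1.chksum AND
    T2.row_num > T1.row_num;   -- > rather than <> lists each pair once

DROP TABLE #Demo_Users;
```

The two identical 'alice' rows should come back as a pair. Because a checksum match is only a candidate, it is worth comparing the reported rows by eye (or column by column) before deleting anything.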
