如何更新引用重复记录的表? [英] How do I update a table that references duplicate records?
问题描述
我有两个SQL表.一个可以从另一张表中获取参考值,该表存储模块列表及其ID.但是这些描述不是唯一的.我正在尝试删除表A的重复项,但不确定如何更新表B以仅引用单个值.
I have two SQL tables. One gets a reference value from another table which stores a list of Modules and their ID. But these descriptions are not unique. I am trying to remove the duplicates of Table A but I'm not sure how to update Table B to only reference the single values.
示例:
Table A: Table B:
-------------------------------- ------------------------------------
ID Description RefID ID Name
-------------------------------- ------------------------------------
1 Test 1 2 1 QuickReports
-------------------------------- ------------------------------------
2 Test 2 1 2 QuickReports
-------------------------------- ------------------------------------
我希望结果如下:
Table A: Table B:
-------------------------------- ------------------------------------
ID Description RefID ID Name
-------------------------------- ------------------------------------
1 Test 1 1 1 QuickReports
-------------------------------- ------------------------------------
2 Test 2 1
--------------------------------
我使用以下代码设法从表B中删除了重复项,但是我无法更新表A中的记录.每个表都有500多个记录.
I managed to delete duplicates from table B using the below code but I haven't been able to update the records in Table A. Each table have over 500 records each.
WITH cte AS(
SELECT
Name,
ROW_NUMBER() OVER (
PARTITION BY
Name
ORDER BY
Name
)row_num
FROM ReportmodulesTest
)
DELETE FROM cte
WHERE row_num > 1;
推荐答案
我找到了使此过程更容易的解决方案.我首先使用 Row_Number
在表A中查找重复项,然后在 SELECT INTO
中创建临时表.
I found a solution that has made this process easier. I first use Row_Number
to find duplicates in Table A and SELECT INTO
a temporary table.
SELECT
a.Id
, a.Name
, ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Id DESC) RN
INTO
#TestTable
FROM
TableA a WITH(NOLOCK)
然后我 JOIN
表A和表B来查看ID的匹配位置,并确定需要保留的ID和要删除的ID:
I then JOIN
Table A and Table B to see where the ID's match and identify which ID I need to keep and which ID's I need to delete:
SELECT
b.Id
, b.Name
, b.RefId
, ToKeep.Id KeepId
, ToDelete.Id DeleteId
FROM
#TestTable ToDelete
JOIN TableB b WITH(NOLOCK)
ON b.RefId = ToDelete.Id
JOIN #TestTable ToKeep
ON ToDelete.Name = ToKeep.Name
AND ToKeep.RN = 1
WHERE ToDelete.RN > 1
然后使用类似的语句,我只更新记录:
Then using a similar statement, I just update the records:
UPDATE b
SET
b.RefId = ToKeep.Id,
FROM #TestTable ToDelete
JOIN TableB b WITH(NOLOCK)
ON b.RefId = ToDelete.Id
JOIN #TestTable ToKeep
ON ToDelete.Name = ToKeep.Name
AND ToKeep.RN = 1
WHERE
ToDelete.RN > 1
最后,我现在可以删除重复的记录:
Lastly, I can now delete the duplicate records:
DELETE a
FROM #TestTable b
INNER JOIN TableA a
ON b.Id = a.Id
WHERE
b.RN > 1
此后,您可以使用相同的第一条 SELECT
语句来确保删除所有重复项.只需删除 SELECT INTO
语句即可.
After this, you can use the same first SELECT
statement to ensure that all duplicates are deleted. Just remove the SELECT INTO
statement.
感谢我的一位匿名同事提出的解决方案,希望这对在那里的人有所帮助.
Thanks to an anonymous colleague of mine for this solution and hope this helps someone out there.
这篇关于如何更新引用重复记录的表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!