当您没有唯一的密钥时,如何删除sybase中的重复行? [英] How to delete duplicate rows in sybase, when you have no unique key?
问题描述
这里发布的最优雅的解决方案适用于SQL Server,但不适用于Sybase(在我的情况下为Sybase Anywhere 11)。我甚至发现一些Sybase相关的问题标记为SQL Server问题的重复,这并不有用。
我喜欢的一些解决方案的一个例子,但没有工作,是 WITH ... DELETE ...
构造。
我已经找到使用光标的工作解决方案-loops,但我希望它可能没有循环。
我希望一个漂亮,简单和快速的查询,只需删除除了一个确切的重复。
这里有一个测试框架:
IF OBJECT_ID('tempdb ..# TestTable')IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable(Column1 varchar(1),Column2 int);
INSERT INTO #TestTable VALUES('A',1);
INSERT INTO #TestTable VALUES('A',1); - duplicate
INSERT INTO #TestTable VALUES('A',1); - 重复
INSERT INTO #TestTable VALUES('A',2);
INSERT INTO #TestTable VALUES('B',1);
INSERT INTO #TestTable VALUES('B',2);
INSERT INTO #TestTable VALUES('B',2); - duplicate
INSERT INTO #TestTable VALUES('C',1);
INSERT INTO #TestTable VALUES('C',2);
SELECT * FROM #TestTable ORDER BY Column1,Column2;
DELETE<你的解决方案>
SELECT * FROM #TestTable ORDER BY Column1,Column2;
好的,现在我知道了 ROWID()
函数,可以轻松采用具有主键(PK)的表的解决方案。这首先选择要保留的所有行,然后删除剩余的行:
DELETE FROM #TestTable
FROM #TestTable
LEFT OUTER JOIN(
SELECT MIN(ROWID(#TestTable))rowid
FROM #TestTable
GROUP BY Column1,Column2
)AS KeepRows ON ROWID(#TestTable )= KeepRows.rowid
WHERE KeepRows.rowid IS NULL;
...或这个较短的变体怎么样?我喜欢!
DELETE FROM #TestTable
WHERE ROWID(#TestTable)NOT IN(
SELECT MIN (ROWID(#TestTable))
FROM #TestTable
GROUP BY Column1,Column2
);
在这篇文章最令我激动的是一个评论: NOT IN
可能会更慢。但是这是SQL服务器,有时优雅更重要:) - 我也认为这一切都取决于良好的索引。
无论如何,通常是糟糕的设计,有没有PK的表。您应该至少添加一个autoincID,如果这样做,您可以使用该ID而不是 ROWID()
函数,这是非标准扩展名Sybase(其他人也有)。
Yes, you can find similar questions numerous times, but: the most elegant solutions posted here, work for SQL Server, but not for Sybase (in my case Sybase Anywhere 11). I have even found some Sybase-related questions marked as duplicates for SQL Server questions, which doesn't help.
One example for solutions I liked, but didn't work, is the WITH ... DELETE ...
construct.
I have found working solutions using cursors or while-loops, but I hope it is possible without loops.
I hope for a nice, simple and fast query, just deleting all but one exact duplicate.
Here a little framework for testing:
IF OBJECT_ID( 'tempdb..#TestTable' ) IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE #TestTable (Column1 varchar(1), Column2 int);
INSERT INTO #TestTable VALUES ('A', 1);
INSERT INTO #TestTable VALUES ('A', 1); -- duplicate
INSERT INTO #TestTable VALUES ('A', 1); -- duplicate
INSERT INTO #TestTable VALUES ('A', 2);
INSERT INTO #TestTable VALUES ('B', 1);
INSERT INTO #TestTable VALUES ('B', 2);
INSERT INTO #TestTable VALUES ('B', 2); -- duplicate
INSERT INTO #TestTable VALUES ('C', 1);
INSERT INTO #TestTable VALUES ('C', 2);
SELECT * FROM #TestTable ORDER BY Column1,Column2;
DELETE <your solution here>
SELECT * FROM #TestTable ORDER BY Column1,Column2;
Ok, now that I know the ROWID()
function, solutions for tables with primary key (PK) can be easily adopted. This one first selects all rows to keep and then deletes the remaining ones:
DELETE FROM #TestTable
FROM #TestTable
LEFT OUTER JOIN (
SELECT MIN(ROWID(#TestTable)) rowid
FROM #TestTable
GROUP BY Column1, Column2
) AS KeepRows ON ROWID(#TestTable) = KeepRows.rowid
WHERE KeepRows.rowid IS NULL;
...or how about this shorter variant? I like!
DELETE FROM #TestTable
WHERE ROWID(#TestTable) NOT IN (
SELECT MIN(ROWID(#TestTable))
FROM #TestTable
GROUP BY Column1, Column2
);
In this post, which inspired me most, is a comment that NOT IN
might be slower. But that's for SQL server, and sometimes elegance is more important :) - I also think it all depends on good indexes.
Anyway, usually it is bad design, to have tables without a PK. You should at least add an "autoinc" ID, and if you do, you can use that ID instead of the ROWID()
function, which is a non-standard extension by Sybase (some others have it, too).
这篇关于当您没有唯一的密钥时,如何删除sybase中的重复行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!