Deleting duplicate rows

Problem description

Hi, I'm new to MySQL. I have built a table that holds a batch of results from a search, and I want to delete any duplicate rows from it.

I'm not sure how to go about this. I thought perhaps of running a search combined with some kind of check and a DELETE statement.

Is there a MySQL function that can do this?

Note that when I say duplicate, I don't necessarily mean that every column of the two rows is identical, only that the rows have enough in common to tell that they are duplicates.

Any help or a reference would be great, thanks.

Recommended answer

Basically, if you have duplicate rows that you want to remove, it sounds like your table structure is wrong. Either the program that puts data into the database, or the database itself through its structure, should filter out the rows you consider duplicates before they ever reach the table.

Take a look at the MySQL documentation (e.g. online) on the "INSERT INTO ... ON DUPLICATE KEY UPDATE" syntax. This should do what you want.

For example, this query statement:

INSERT INTO mytable (id, xpos, ypos, string1, string2, string3)
VALUES (25, 2, 5, 'hello', 'world', 'from me')
ON DUPLICATE KEY UPDATE string1 = VALUES(string1);

This is only an example and will not do exactly what you need, but look at what it does. Assume I have set up my table "mytable" with a unique key composed of the three columns id, xpos and ypos.

Say I try to insert a set of values whose combination of id, xpos and ypos is not yet in the table. In that case the insert part of the query is carried out and the row is added.

Now say I try to insert a set of values whose combination of id, xpos and ypos is already present in the table. Because of the "ON DUPLICATE KEY UPDATE" clause, MySQL does not return a duplicate-key error; instead it carries out the assignments that follow the ON DUPLICATE KEY UPDATE keywords. In the example above, I ask it to overwrite the stored value of string1 with the value from my INSERT statement and to leave the current values of string2 and string3 alone. You can update any or none of the other columns, as needed.
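The two cases above can be tried out with a small sketch. Only the column names and the fact that id, xpos and ypos form a unique key come from the answer; the column types, the key name uniq_id_pos and the second set of string values are assumptions made for illustration.

```sql
-- Assumed definition: the types and the key name are illustrative only.
CREATE TABLE mytable (
    id      INT NOT NULL,
    xpos    INT NOT NULL,
    ypos    INT NOT NULL,
    string1 VARCHAR(50),
    string2 VARCHAR(50),
    string3 VARCHAR(50),
    UNIQUE KEY uniq_id_pos (id, xpos, ypos)
);

-- (25, 2, 5) is not in the table yet, so this behaves as a plain insert.
INSERT INTO mytable (id, xpos, ypos, string1, string2, string3)
VALUES (25, 2, 5, 'hello', 'world', 'from me')
ON DUPLICATE KEY UPDATE string1 = VALUES(string1);

-- (25, 2, 5) now exists, so no new row is added; only string1 is overwritten.
INSERT INTO mytable (id, xpos, ypos, string1, string2, string3)
VALUES (25, 2, 5, 'goodbye', 'everyone', 'from you')
ON DUPLICATE KEY UPDATE string1 = VALUES(string1);

-- One row remains: 25, 2, 5, 'goodbye', 'world', 'from me'
SELECT * FROM mytable;
```

Recent MySQL 8.0 releases deprecate VALUES() in this position in favour of a row alias, although it still works; the sketch keeps the form used in the answer.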

What I am trying to say is this: instead of looking for a query that removes the rows you consider duplicates, change your queries and your database structure (by adding unique keys) so that those duplicate rows never get into the database in the first place.
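The answer does not cover rows that are already duplicated, and adding a unique key to a table that still contains duplicates fails with a duplicate-entry error. One possible way to clean up first is sketched below. It is not from the answer: the table search_results, its AUTO_INCREMENT column row_id and the columns col_a, col_b and extra are all hypothetical, and the ON clause would need to list however many columns define a duplicate in the real table.

```sql
-- Hypothetical table; none of these names come from the original post.
CREATE TABLE search_results (
    row_id INT AUTO_INCREMENT PRIMARY KEY,
    col_a  VARCHAR(50) NOT NULL,
    col_b  VARCHAR(50) NOT NULL,
    extra  VARCHAR(50)
);

INSERT INTO search_results (col_a, col_b, extra) VALUES
    ('a', 'b', 'from source 1'),
    ('a', 'b', 'from source 2'),   -- duplicate of the first row on (col_a, col_b)
    ('c', 'd', 'from source 1');

-- Delete every row that has an older twin: same col_a and col_b, lower row_id.
DELETE newer
FROM search_results AS newer
JOIN search_results AS older
  ON  newer.col_a  = older.col_a
  AND newer.col_b  = older.col_b
  AND newer.row_id > older.row_id;

-- With the duplicates gone, the unique key can be added without error.
ALTER TABLE search_results
    ADD UNIQUE KEY uniq_result (col_a, col_b);
```

This keeps the earliest row of each group. If a different row should survive, the comparison on row_id would have to change accordingly.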


First, thanks for replying. This sounds like what I probably need to do. I have roughly five columns that can be used to tell rows apart, and another six or so columns of arbitrary data that relates to those columns but can vary between duplicates of a row.

In other words, the first five columns can be used to find a duplicate, but the rest vary. That means I have to check for duplicates on the first five columns, but I also need to store data in the remaining columns, which may not be the same. Why are there differences between duplicates? Because I am retrieving the same information from different sources, and the sources overlap with each other.

Next, I think I understand your code example. Correct me if I'm wrong, but it seems that when there is a duplicate, a specific column is updated rather than the row simply not being added at all.

What I'm asking is whether it is possible to skip the row entirely instead of updating a column.

Note that I'm using a PHP script to insert the data into the database.

Thanks for the help,
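On skipping the row instead of updating it: one option, not mentioned in the answer, is MySQL's INSERT IGNORE, which discards a row that would violate a unique key and raises a warning instead of an error. The sketch below reuses the answer's mytable with assumed column types. Note that IGNORE also downgrades some other errors to warnings, so a no-op ON DUPLICATE KEY UPDATE is shown as a narrower alternative.

```sql
-- Assumed definition of the answer's table; the types are illustrative only.
CREATE TABLE IF NOT EXISTS mytable (
    id      INT NOT NULL,
    xpos    INT NOT NULL,
    ypos    INT NOT NULL,
    string1 VARCHAR(50),
    string2 VARCHAR(50),
    string3 VARCHAR(50),
    UNIQUE KEY uniq_id_pos (id, xpos, ypos)
);

-- Inserted only if no row with (25, 2, 5) exists yet.
INSERT IGNORE INTO mytable (id, xpos, ypos, string1, string2, string3)
VALUES (25, 2, 5, 'hello', 'world', 'from me');

-- Same key again: the row is skipped and the existing row is left untouched.
INSERT IGNORE INTO mytable (id, xpos, ypos, string1, string2, string3)
VALUES (25, 2, 5, 'goodbye', 'everyone', 'from you');

-- Alternative: assign a column to itself so a duplicate changes nothing,
-- while other kinds of errors are still reported as errors.
INSERT INTO mytable (id, xpos, ypos, string1, string2, string3)
VALUES (25, 2, 5, 'goodbye', 'everyone', 'from you')
ON DUPLICATE KEY UPDATE id = id;
```

Either statement can be sent from a PHP script in the same way as an ordinary INSERT.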


I think I have just realised that on a duplicate key it skips inserting the new entry and only updates the listed columns, or none at all.

That will work fine, but can it detect a duplicate key when only five of the ten columns are identical?

Thanks for the help
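A unique key only compares the columns listed in it, so it can cover five of the ten columns and ignore the rest. The sketch below assumes a hypothetical table source_results with key columns key1 to key5 and data columns data1 to data5; none of these names or types come from the thread. The key columns are declared NOT NULL because a MySQL unique index allows several rows that contain NULL in a key column.

```sql
-- Hypothetical ten-column table; names and types are assumptions.
CREATE TABLE source_results (
    key1  INT NOT NULL,
    key2  INT NOT NULL,
    key3  INT NOT NULL,
    key4  VARCHAR(50) NOT NULL,
    key5  VARCHAR(50) NOT NULL,
    data1 VARCHAR(100),
    data2 VARCHAR(100),
    data3 VARCHAR(100),
    data4 VARCHAR(100),
    data5 VARCHAR(100),
    UNIQUE KEY uniq_five (key1, key2, key3, key4, key5)
);

-- First sighting of this key combination: the row is inserted.
INSERT INTO source_results
    (key1, key2, key3, key4, key5, data1, data2, data3, data4, data5)
VALUES (1, 2, 3, 'x', 'y', 'a1', 'a2', 'a3', 'a4', 'a5')
ON DUPLICATE KEY UPDATE data1 = VALUES(data1);

-- Same five key values, different data: treated as a duplicate, so no row
-- is added and only data1 is overwritten with 'b1'.
INSERT INTO source_results
    (key1, key2, key3, key4, key5, data1, data2, data3, data4, data5)
VALUES (1, 2, 3, 'x', 'y', 'b1', 'b2', 'b3', 'b4', 'b5')
ON DUPLICATE KEY UPDATE data1 = VALUES(data1);
```

Which of the data columns to overwrite, if any, is a choice made in the UPDATE list.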

