SQL查询忽略相邻的匹配行 [英] SQL Query to ignore adjacent matching rows

查看:84
本文介绍了SQL查询忽略相邻的匹配行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个活动表,其中包含各种活动和(actorEntity|subjectEntity|activity-type)

I have an activity table that contains various activities and a hash of (actorEntity|subjectEntity|activity-type)

在某些情况下,用户可能在几秒钟内执行两次完全相同的活动,这意味着数据库中将有两个相邻的行具有完全相同的数据.

There may be cases where a user performs the exact same activity twice in a matter of seconds, meaning there would be two adjacent rows in the database with exactly the same data.

对于某些用例,我们不想显示重复的活动,因此我们现在基于哈希在应用程序中过滤掉重复项.唯一的问题是,在某些情况下也会对这些活动进行分页,这意味着从应用程序的结果集中提取记录将与分页一起发生.

For some use cases, we do not want to display duplicate activities so we filter out duplicates in the application right now based on the hash. The only problem is that these activities are also paged in some cases, meaning that pulling out records from the result set in the application screws with the paging.

我正在寻找一种在SQL中执行此操作的方法,以便可以在数据库级别进行分页.

I am searching for a way to do this in SQL so that paging can be done at the DB level.

提供数据:

id  | message           | from  | hash
-------------------------------------
1   | hello             | bryan | b-hello
2   | goodbye           | bryan | b-goodbye
3   | goodbye           | john  | j-goodbye
4   | goodbye           | john  | j-goodbye 
5   | hello             | john  | j-hello   
6   | goodbye           | john  | j-goodbye

我想检索以下结果集:

id  | message           | from  | hash
-------------------------------------
1   | hello             | bryan | b-hello
2   | goodbye           | bryan | b-goodbye
4   | goodbye           | john  | j-goodbye
5   | hello             | john  | j-hello
6   | goodbye           | john  | j-goodbye

请注意,尽管第3行和第4行相同,但由于它们相邻,因此所需结果集中只有一个.即使第6行是相同的哈希,也应将其包括在结果中,因为它与另一个相同的哈希不相邻.

Notice that while rows 3 and 4 are identical, only one is in the desired result set because they were adjacent. Even though row 6 was the same hash, it should be included in the results since it was not adjacent to another identical hash.

我不在乎结果中返回哪些相邻行,因为它们相同.

I do not care which of the adjacent rows are returned in the results since they are the same.

我正在使用MySql 5.5.

I am using MySql 5.5.

推荐答案

查看以下内容:* SQLFIDDLE 我有点怀疑,对于您的要求来说这太简单了.所以请发表评论.我在样本数据中添加了更多重复记录.以下查询将删除最新的重复记录,并保留第一个条目.

Check this out: * SQLFIDDLE I am a little doubtful as if this was too simple for what you asked. So please comment. I added more duplicate records to sample data. The following query is removing the latest duplicates, keeping the first entry.

样本数据:

ID  MESSAGE     FROMA   HASHA
1   hello       bryan   b-hello
2   goodbye     bryan   b-goodbye
3   goodbye     john    j-goodbye
4   goodbye     john    j-goodbye
5   goodbye     john    j-goodbye
6   hello       john    j-hello
7   goodbye     bryan   b-goodbye
8   goodbye     bryan   b-goodbye

查询以查找相同的记录(最新的一个或多个):

Query to find identical record (the latest one or more):

select* from actors a
join actors b
where b.id + 1 = a.id
and b.hasha = a.hasha
;

结果:

ID  MESSAGE     FROMA   HASHA
4   goodbye     john    j-goodbye
5   goodbye     john    j-goodbye
8   goodbye     bryan   b-goodbye

查询以获取唯一记录:

select * from actors a
where a.id not in (
select a.id from actors a
join actors b
where b.id + 1 = a.id
and b.hasha = a.hasha
);

结果:

ID  MESSAGE     FROMA   HASHA
1   hello       bryan   b-hello
2   goodbye     bryan   b-goodbye
3   goodbye     john    j-goodbye
6   hello       john    j-hello
7   goodbye     bryan   b-goodbye

这篇关于SQL查询忽略相邻的匹配行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆