MYSQL:避免在自联接时重复记录的笛卡尔积 [英] MYSQL: Avoiding cartesian product of repeating records when self-joining

查看:640
本文介绍了MYSQL:避免在自联接时重复记录的笛卡尔积的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有两个表:表A和表B.它们具有相同的列,并且数据实际上是相同的.它们都具有自动递增的ID,两者之间的唯一区别是对于相同的记录,它们具有不同的ID.

There are two tables: table A and table B. They have the same columns and the data is practically identical. They both have auto-incremented IDs, the only difference between the two is that they have different IDs for the same records.

在各列中,有一个IDENTIFIER列不是唯一的,即,在两个表中都有(很少)具有相同IDENTIFIER的记录.

Among the columns, there is an IDENTIFIER column which is not unique, i.e. there are (very few) records with the same IDENTIFIER in both tables.

现在,为了找到表A的ID和表B的ID之间的对应关系,我必须将这两个表连接起来(出于所有目的,这是一个自连接), IDENTIFIER列,类似于:

Now, in order to find a correspondence between the IDs of table A and the IDs of table B, I have to join these two tables (for all purposes it's a self-join) on the IDENTIFIER column, something like:

SELECT A.ID, B.ID
FROM A INNER JOIN B ON A.IDENTIFIER = B.IDENTIFIER

但是,由于IDENTIFIER是唯一的,因此会生成IDENTIFIER重复值的所有可能组合,我不希望如此.

理想情况下,我想根据其顺序在具有重复IDENTIFIER值的ID之间生成一对一关联.例如,假设在表A(因此在表B)中有六个具有不同ID和相同IDENTIFIER值的记录:

Ideally, I would like to generate an one to one association between IDs that have repeating IDENTIFIER values, based on their order. For example, supposing that there are six records with different ID and the same IDENTIFIER value in table A (and thus in table B):

A                                 B
IDENTIFIER:'ident105', ID:10  ->  IDENTIFIER:'ident105', ID:3
IDENTIFIER:'ident105', ID:20  ->  IDENTIFIER:'ident105', ID:400
IDENTIFIER:'ident105', ID:23  ->  IDENTIFIER:'ident105', ID:420
IDENTIFIER:'ident105', ID:100 ->  IDENTIFIER:'ident105', ID:512
IDENTIFIER:'ident105', ID:120 ->  IDENTIFIER:'ident105', ID:513
IDENTIFIER:'ident105', ID:300 ->  IDENTIFIER:'ident105', ID:798

那将是理想的. 无论如何,不​​管ID的顺序如何,都可以生成一对一的关联(但不是首选).

That would be ideal. Anyway, a way to generate a one to one association regardless of the order of the IDs would still be ok (but not preferred).

感谢您的时间,

西尔维奥

推荐答案

select a_numbered.id, a_numbered.identifier, b_numbered.id from 
(
select a.*,
       case 
          when @identifier = a.identifier then @rownum := @rownum + 1
          else @rownum := 1
       end as rn,
       @identifier := a.identifier
  from a
  join (select @rownum := 0, @identifier := null) r
order by a.identifier

) a_numbered join (
select b.*,
       case 
          when @identifier = b.identifier then @rownum := @rownum + 1
          else @rownum := 1
       end as rn,
       @identifier := b.identifier
  from b
  join (select @rownum := 0, @identifier := null) r
order by b.identifier

) b_numbered 
on a_numbered.rn=b_numbered.rn and a_numbered.identifier=b_numbered.identifier

这篇关于MYSQL:避免在自联接时重复记录的笛卡尔积的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆