消除具有反向关系且没有主键的元组 [英] Eliminate tuples with reverse relation and no primary Key

查看:31
本文介绍了消除具有反向关系且没有主键的元组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试根据以下问题插入所有关系.我已经得到了从 A 到 C 的所有关系(根据下面的问题).但至于我得到的,我也得到了C 和 A 的朋友"的记录.据我了解问题陈述没有重复的友谊",我必须将友谊作为 A 插入到 C,反之亦然.要么我理解错了问题,要么我无法得到想要的结果.
所以,当我尝试将我得到的所有值插入到表中时,我的结果是错误的.有些人的朋友数量超过期望值,值为 2.

I am trying to insert all the relations based on the question below. I have got as far, to get all the relations from A to C (as per the question below). But as for I get, I am also getting the records getting "C friends with A". As far as I understand the question statement "No duplicate friendships", I have to insert friendship as A to C and not vice-versa. Either I understand the problem wrong, or I can't get the desired result.
So, when I try to insert into the table all the values I've got, my result is wrong. Some persons have friends that are more than desired, by a value of 2.

表的结构如下:

Friend ( ID1, ID2 )

ID1 的学生与 ID2 的学生是朋友.友谊是相互的,所以如果 (123, 456) 在 Friend 表中,那么 (456, 123) 也是如此.

The student with ID1 is friends with the student with ID2. Friendship is mutual, so if (123, 456) is in the Friend table, so is (456, 123).

(无主键)我试图解决的情况是:

(No primary key) The situation I am trying to solve is:

对于所有 A 与 B 为朋友,B 与 C 为朋友的情况,为 A 和 C 对添加新的好友.不要添加重复的好友、已存在的好友或与自己的好友."

"For all cases where A is friends with B, and B is friends with C, add a new friendship for the pair A and C. Do not add duplicate friendships, friendships that already exist, or friendships with oneself."

我已经尝试解决这个问题 2 天了.请帮忙.

I have been trying to solve this problem for 2 days now. Please help.

提前致谢.

----我的 SQL 查询-----

----My SQL query-----

select B.ID1 as ID1,B.ID3 as ID2
from (select A.ID1 as ID1,A.ID2 as ID2,A.ID3 as ID3,F3.ID2 as ID4
from (select F1.ID1 as ID1,F1.ID2 as ID2,F2.ID2 as ID3
from Friend F1 join Friend F2
on F1.ID2=F2.ID1
where F1.ID1<>F2.ID2) A join Friend F3
on A.ID3=F3.ID1) B
where B.ID1<>B.ID4
group by  B.ID1,B.ID3

推荐答案

让我们假设(对于示例)朋友表包含这些行.

Let's assume (for the examples) that the friend table contains these rows.

ID1 ID2
--- ---
 a   b
 a   c
 b   a
 b   c
 b   d
 c   b

首先,从朋友表中识别完整朋友"元组开始,查询如下:

First, start with identifying the 'full friend' tuples from the friend table, with a query like this:

SELECT fa.ID1
     , fa.ID2
  FROM friend fa
  JOIN friend fb
    ON fb.ID1 = fa.ID2
   AND fb.ID2 = fa.ID1

fa.ID1 fa.ID2
------ ------
  a      b
  b      a
  b      c
  c      b

这个结果告诉我们a是b的朋友,b是c的朋友.(a,c)(b,d) 行被省略,因为没有逆,(c,a)>(d,b).

This result shows us that a is friends with b, and b is friends with c. The (a,c) and (b,d) rows are omitted because there is no inverse, (c,a) or (d,b).

暂时,我们将此集合称为ft"(朋友元组).现在我们可以针对该集合 (ft) 编写一个查询,以获取所有a->b->c"和c->b->a"朋友对.

For the time being, we'll refer to this set as "ft" (friend tuples). Now we can write a query against that set (ft), to get all the "a->b->c" and "c->b->a" friend pairs.

SELECT fx.ID1
     , fy.ID2
  FROM ft fx
  JOIN ft fy
    ON fy.ID1 = fx.ID2 
   AND fy.ID2 <> fx.ID1

fx.ID1 fy.ID2
------ ------
  a      c
  c      a

但是,我们需要确保我们不会复制友元表中已经存在的任何行,因此我们可以使用 NOT IN 或 NOT EXISTS 谓词,或者我们可以使用反连接模式来删除与朋友表中已有的行匹配的行.

But, we need to be sure that we don't duplicate any rows that are already in the friend table, so we could use a NOT IN or a NOT EXISTS predicate, or we can use an anti-join pattern, to eliminate rows that match a row already in the friend table.

SELECT fx.ID1
     , fy.ID2
  FROM ft fx
  JOIN ft fy
    ON fy.ID1 = fx.ID2 
   AND fy.ID2 <> fx.ID1
  -- eliminate rows that match
  LEFT
  JOIN friend fe
    ON fe.ID1 = fx.ID1
   AND fe.ID2 = fy.ID2
 WHERE fe.ID1 IS NULL

fx.ID1 fy.ID2
------ ------
  c      a

现在,我们可以用生成集合的查询(作为内联视图)替换对 ft 的引用:

Now, we can replace the references to ft with the query (as an inline view) that produces the set:

SELECT fx.ID1
     , fy.ID2
  FROM ( SELECT fa.ID1
              , fa.ID2
           FROM friend fa
           JOIN friend fb
             ON fb.ID1 = fa.ID2
            AND fb.ID2 = fa.ID1
       ) fx
  JOIN ( SELECT fc.ID1
              , fc.ID2
           FROM friend fc
           JOIN friend fd
             ON fd.ID1 = fc.ID2
            AND fd.ID2 = fc.ID1
       ) fy
    ON fy.ID1 = fx.ID2 
   AND fy.ID2 <> fx.ID1
  -- eliminate rows that match
  LEFT
  JOIN friend fe
    ON fe.ID1 = fx.ID1
   AND fe.ID2 = fy.ID2
 WHERE fe.ID1 IS NULL
 GROUP 
    BY fx.ID1
     , fy.ID2

(我在想,只要我们保证 (ID1,ID2) 是唯一的,这个查询就不会生成任何重复项.我想这个查询只会生成指定的匹配项,而不是任何额外的匹配.一些额外的测试用例将是为了确认.如果查询确实产生了任何重复,然后添加一个 GROUP BY fx.ID1, fy.ID2 到查询将消除它们.)

(I'm thinking as long as we are guaranteed that (ID1,ID2) is unique, that this query won't generate any duplicates. And I'm thinking that this query will generate only the matches specified, and not any extra matches. Some additional test cases would be in order to confirm. If the query does produce any duplicates, then adding a GROUP BY fx.ID1, fy.ID2 to the query would eliminate them.)

最后,要将这些行放入友元表中,请在查询之前使用:

Finally, to put those rows into the friend table, precede the query with:

INSERT INTO friend (ID1,ID2)

<小时>

更新

我们想要返回的结果实际上取决于友谊"的表示方式.

The result we want returned really depends on how a "friendship" is represented.

我假设 friend 表中存在两个元组来表示朋友"对:(a,b) 和 (b,a)必须存在.(当a朋友b"和b朋友a"形成友谊).

I was assuming that "friend" pair was represented in the friend table by the existence of two tuples: both (a,b) and (b,a) have to exist. (A friendship is formed when "a friends b", and "b friends a).

如果只有一排,就不是真正的友谊,只是半途而废.

If only one of the rows exists, it's not a real friendship, only a halfway friendship.

我运行了几个测试用例.通过它们工作有点乏味.我通过添加 ORDER BY 以确定性顺序返回行并在 SELECT 列表中添加其他列来扩展查询,以验证路径"(共享朋友).我注释掉了 WHERE 子句,所以我可以看到所有潜在的朋友.

I ran several test cases. It's kind of tedious working through them. I expanded the query by adding an ORDER BY to get the rows back in a deterministic order, and adding additional columns in the SELECT list, to verify the "path" (shared friend). I commented out the WHERE clause, so I could see all the potential friends.

我确实发现我需要添加一个 GROUP BY 来消除重复项.我们可以从两个或多个共享朋友中推导出 a-c 友谊,例如br.a-b + b-ca-r + r-c 的结果都是 a-c.

I did find that I needed to add a GROUP BY to eliminate duplicates. We can derive the a-c friendship from two or more shared friends e.g. b and r. Both a-b + b-c and a-r + r-c result in a-c.

这是我测试的最后一个查询.除了增加了 GROUP BY 之外,它与前面的基本相同.

This is the final query I tested. It's essentially equivalent to the previous, except for the addition of the GROUP BY.

SELECT fx.ID1
     , fy.ID2
 --  , fx.ID1>fy.ID2 AS d
 --  , fx.ID1 AS x1
 --  , fx.ID2 As x2
 --  , fy.ID1 AS y1
 --  , fy.ID2 As y2
 --  , fe.ID1 AS e1
 --  , fe.ID2 AS e2
  FROM ( SELECT fa.ID1
              , fa.ID2
              , fa.ID1>fa.ID2 AS d
           FROM friend fa
           JOIN friend fb
             ON fb.ID1 = fa.ID2
            AND fb.ID2 = fa.ID1
       -- ORDER
       --    BY LEAST(fa.ID1,fa.ID2)
       --     , GREATEST(fa.ID1,fa.ID2)
       --     , fa.ID1>fa.ID2
       ) fx
  JOIN ( SELECT fc.ID1
              , fc.ID2
           FROM friend fc
           JOIN friend fd
             ON fd.ID1 = fc.ID2
            AND fd.ID2 = fc.ID1
       -- ORDER
       --    BY LEAST(fc.ID1,fc.ID2)
       --     , GREATEST(fc.ID1,fc.ID2)
       --     , fc.ID1>fc.ID2
       ) fy
    ON fy.ID1 = fx.ID2 
   AND fy.ID2 <> fx.ID1
  -- eliminate rows that match existing row
  LEFT
  JOIN friend fe
    ON fe.ID1 = fx.ID1
   AND fe.ID2 = fy.ID2
 WHERE fe.ID1 IS NULL
 GROUP
    BY fx.ID1
     , fy.ID2
 ORDER
    BY LEAST(fx.ID1,fy.ID2)
     , GREATEST(fx.ID1,fy.ID2)
     , fx.ID1>fy.ID2

<小时>

如果仅存在一个元组(a,b)"表示(b,a)"表示完全友谊,则需要更改查询.


If a full friendship is represented by the existence of just one tuple "(a,b)" implies "(b,a)", then the query would need to be changed.

fxfy 的内联视图查询需要扩展以返回缺失"的逆元组...如果 (a,b) 在朋友表,我们的查询需要返回 (a,b) 和 (b,a).我们将通过在两个相同的查询之间执行 UNION ALL 操作来实现这一点,只需颠倒 SELECT 列表中列的顺序.(在这里,我们实际上可以使用 UNION 而不是 UNION ALL 来消除任何重复项.)fxfy 的内联视图查询将类似于:

The inline view query for fx and fy would need to be expanded to return the "missing" inverse tuples... if (a,b) is in the friend table, our query needs to return both (a,b) and (b,a). We'd accomplish that by doing a UNION ALL operation between two identical queries, with just the order of the columns in the SELECT list reversed. (Here, we could actually make use of UNION instead of UNION ALL to eliminate any duplicates.) The inline view query for fx and fy would be something like:

SELECT fa.ID1, fa.ID2 FROM ...
 UNION ALL
SELECT fa.ID2, fa.ID1 FROM ...

消除友元表中匹配行的检查也需要更改(如果我们发现存在的 (a,b) 和 (b,a),我们希望从结果集中消除 (a,b) 和 (b,a)) 或 (b,a) 行)

The check to eliminate matching rows in the friend table would also need to be changed (we'd want to eliminate both (a,b) and (b,a) from the resultset if we found an existing (a,b) or (b,a) row)

ON ( fe.ID1 = fx.ID1 AND fe.ID2 = fy.ID2 )
OR ( fe.ID1 = fy.ID2 AND fe.ID2 = fx.ID1 )

并且需要更改 SELECT 列表和 GROUP BY 以消除额外的"逆元组.我们可以使用像 ORDER BY

And the SELECT list and GROUP BY would need to be changed to eliminate the "extra" inverse tuple. We could use an expression like in the ORDER BY

SELECT LEAST(fx.ID1,fy.ID2) AS ID1
     , GREATEST(fx.ID1,fy.ID2) AS ID2
       ...
 GROUP
    BY LEAST(fx.ID1,fy.ID2)
     , GREATEST(fx.ID1,fy.ID2)

这篇关于消除具有反向关系且没有主键的元组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆