在LEFT JOIN中防止重复值 [英] Prevent duplicate values in LEFT JOIN
问题描述
我遇到了从LEFT JOIN
获得重复值的情况.我认为这可能是理想的行为,但与我想要的有所不同.
I faced a situation where I got duplicate values from LEFT JOIN
. I think this might be a desired behavior but unlike from what I want.
我有三个表: person
, department
和 contact
.
I have three tables: person
, department
and contact
.
人:
id bigint,
person_name character varying(255)
部门:
person_id bigint,
department_name character varying(255)
联系人:
person_id bigint,
phone_number character varying(255)
SQL查询:
SELECT p.id, p.person_name, d.department_name, c.phone_number
FROM person p
LEFT JOIN department d
ON p.id = d.person_id
LEFT JOIN contact c
ON p.id = c.person_id;
结果:
id|person_name|department_name|phone_number
--+-----------+---------------+------------
1 |"John" |"Finance" |"023451"
1 |"John" |"Finance" |"99478"
1 |"John" |"Finance" |"67890"
1 |"John" |"Marketing" |"023451"
1 |"John" |"Marketing" |"99478"
1 |"John" |"Marketing" |"67890"
2 |"Barbara" |"Finance" |""
3 |"Michelle" |"" |"005634"
我知道这就是联接的作用,不断与选定的行相乘.但这给人一种感觉,就像电话号码023451
,99478
,67890
都用于两个部门,而它们仅与具有不必要重复值的john人有关,这将放大更大的数据集.
所以,这就是我想要的:
I know it's what joins do, keeping multiplied with selected rows. But It gives a sense like phone numbers 023451
,99478
,67890
are for both departments while they are only related to person john with unnecessary repeated values which will escalate the problem with larger data set.
So, here is what I want:
id|person_name|department_name|phone_number
--+-----------+---------------+------------
1 |"John" |"Finance" |"023451"
1 |"John" |"Marketing" |"99478"
1 |"John" |"" |"67890"
2 |"Barbara" |"Finance" |""
3 |"Michelle" |"" |"005634"
这是我的情况的一个示例,并且我使用了大量的表和查询.因此,需要一种通用的解决方案.
This is a sample of my situation and I am using a large set of tables and queries. So, kind of need a generic solution.
推荐答案
我喜欢将此问题称为代理交叉加入" .由于没有信息(WHERE
或JOIN
条件)应该如何匹配表department
和contact
,因此它们通过代理表person
交叉连接-为您提供笛卡尔积.与此非常相似:
I like to call this problem "cross join by proxy". Since there is no information (WHERE
or JOIN
condition) how the tables department
and contact
are supposed to match up, they are cross-joined via the proxy table person
- giving you the Cartesian product. Very similar to this one:
更多说明.
您的查询解决方案:
SELECT p.id, p.person_name, d.department_name, c.phone_number
FROM person p
LEFT JOIN (
SELECT person_id, min(department_name) AS department_name
FROM department
GROUP BY person_id
) d ON d.person_id = p.id
LEFT JOIN (
SELECT person_id, min(phone_number) AS phone_number
FROM contact
GROUP BY person_id
) c ON c.person_id = p.id;
您没有定义选择哪个部门或电话号码,因此我随意选择了第一个.您可以通过其他任何方式使用它...
You did not define which department or phone number to pick, so I arbitrarily chose the first. You can have it any other way ...
这篇关于在LEFT JOIN中防止重复值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!