左连接逻辑不符合预期 [英] left join logic not as expected

查看:41
本文介绍了左连接逻辑不符合预期的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我创建了一个查询,我相信它应该返回表 1 中的所有电子邮件地址.

I have created a query that should, I believe, return all email addresses from table 1 regardless.

如果我去 SELECT COUNT(email), COUNT(DISTINCT email)contacts.sid208 我得到 200,000 和 175000.

If I go SELECT COUNT(email), COUNT(DISTINCT email) contacts.sid208 I get 200,000 and 175000.

考虑到这一点,通过使用左连接,来自以下查询结果的电子邮件计数应该相同吗?

With this in mind, by using left joins the count of email from the following query result should be the same no?

SELECT 
    COUNT(email), COUNT(DISTINCT email)
FROM
    (SELECT 
        co.email,
            env.env_medium,
            CAST(MIN(co.created) AS DATE) AS first_contact,
            MIN(CASE
                WHEN my.my_id = 581 THEN my.data
            END) AS Created,
            MIN(CASE
                WHEN my.my_id = 3347 THEN my.data
            END) AS Upgraded
    FROM
        contacts.sid208 co
    LEFT JOIN contacts.my208 my ON co.id = my.eid
    LEFT JOIN contacts.env208 env ON env.eid = co.id
    WHERE
        my_id = 581 OR my_id = 3347
    GROUP BY email) b1

但这里的结果,如果我保持比例,是 150000 和 150000.

But the results here, if I keep things proportionate, are 150000 and 150000.

我预计结果是 175000.

I expected the results to be 175000.

我对 LEFT JOIN 的理解是,所有来自contacts.sid208 的记录都会被保留,无论它们是否出现在 my208 或 env208 中.

My understanding of LEFT JOIN was that all records from contacts.sid208 would be maintained, regardless of whether or not they appear in my208 or env208.

我的理解有问题吗?希望我的查询对人们有意义,如果我可以添加更多信息以使我的问题更清楚,请告诉我.

Is my understanding flawed here? Hope my query makes sense to folk, if there's any more info I can add to make my question clearer let me know.

推荐答案

对于左联接,也将条件移至联接:

For a left join, move the conditions to the join as well:

SELECT 
    COUNT(email), COUNT(DISTINCT email)
FROM
    (SELECT 
        co.email,
            env.env_medium,
            CAST(MIN(co.created) AS DATE) AS first_contact,
            MIN(CASE
                WHEN my.my_id = 581 THEN my.data
            END) AS Created,
            MIN(CASE
                WHEN my.my_id = 3347 THEN my.data
            END) AS Upgraded
    FROM
        contacts.sid208 co
    LEFT JOIN contacts.my208 my 
        ON co.id = my.eid
        AND (my_id = 581 OR my_id = 3347)
    LEFT JOIN contacts.env208 env ON env.eid = co.id
    GROUP BY email) b1

如果您不这样做,您将首先执行连接,从而导致来自 sid208 的所有行,不管怎样,对于丢失的电子邮件,都带有 null 值.但是随后 where 子句中的过滤开始起作用,并且无论如何都会删除这些记录.

If you don't do so, you will first perform the join, resulting in all rows from sid208, regardless, with null values for missing emails. But then the filtering in the where clause kicks in and those records are removed anyway.

当您将所有这些条件移动到 join 时,您将获得所有行,并且只有当电子邮件具有匹配的联系人 id 并且它们自己的 id 为 581 或 2247 时,才会加入.

When you move all those conditions to the join, you get all rows, and the emails are only joined when they have the matching contact id, and their own id is either 581 or 2247.

这篇关于左连接逻辑不符合预期的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆