MySql: Multiple Left Join giving wrong output

Question

I'm having a little trouble using multiple LEFT JOINs in one query. Some of the tables have a one-to-one relationship with the left table and some have a one-to-many relationship. The query looks like this:

Select 
    files.filename,
    coalesce(count(distinct case
                when dm_data.weather like '%clear%' then 1
                    end),
            0) as clear,
    coalesce(count(distinct case
                when dm_data.weather like '%lightRain%' then 1
                    end),
            0) as lightRain,
    coalesce(count(case
                when kc_data.type like '%bicycle%' then 1
                    end),
            0) as bicycle,
    coalesce(count(case
                when kc_data.type like '%bus%' then 1
                    end),
            0) as bus,
    coalesce(count(case
                when kpo_data.movement like '%walking%' then 1
                    end),
            0) as walking,
    coalesce(count(case
                when kpo_data.type like '%pedestrian%' then 1
                    end),
            0) as pedestrian
from
    files
        left join
    dm_data ON dm_data.id = files.id
        left join
    kc_data ON kc_data.id = files.id
        left join
    kpo_data ON kpo_data.id = files.id
where
    files.filename in (X, Y, Z, ........)
group by files.filename;

Here, the dm_data table has a one-to-one relationship with the files table (that's why I'm using DISTINCT), whereas kc_data and kpo_data have a one-to-many relationship with the files table (kc_data and kpo_data can each have 10 to 20 rows per files.id). This query works fine.

The problem arises when I add another left join to another one-to-many table, pd_markings (which can have hundreds of rows per files.id).

Select 
    files.filename,
    coalesce(count(distinct case
                when dm_data.weather like '%clear%' then 1
                    end),
            0) as clear,
    coalesce(count(distinct case
                when dm_data.weather like '%lightRain%' then 1
                    end),
            0) as lightRain,
    coalesce(count(case
                when kc_data.type like '%bicycle%' then 1
                    end),
            0) as bicycle,
    coalesce(count(case
                when kc_data.type like '%bus%' then 1
                    end),
            0) as bus,
    coalesce(count(case
                when kpo_data.movement like '%walking%' then 1
                    end),
            0) as walking,
    coalesce(count(case
                when kpo_data.type like '%pedestrian%' then 1
                    end),
            0) as pedestrian,
    coalesce(count(case
                when pd_markings.movement like '%walking%' then 1
                    end),
            0) as pd_walking
from
    files
        left join
    dm_data ON dm_data.id = files.id
        left join
    kc_data ON kc_data.id = files.id
        left join
    kpo_data ON kpo_data.id = files.id
        left join
    pd_markings ON pd_markings.id = files.id
where
    files.filename in (X, Y, Z, ........)
group by files.filename;

Now all the values become multiples of each other. Any ideas?

Note that the first two columns return 1 or 0. That is actually the desired result, because a one-to-one table has either 1 or 0 rows for any files.id. If I don't use DISTINCT, the resulting value is wrong (I guess because the other tables return more than one row for the same files.id). Unfortunately, my tables don't have their own unique ID columns, except for the files table.

Recommended answer

You need to flatten the results of your query in order to obtain the right count.

You said you have a one-to-many relationship from your files table to the other table(s).

If SQL had a dedicated LOOKUP keyword instead of cramming everything into JOIN, it would be easy to tell when the relation between table A and table B is one-to-one, and JOIN would then automatically connote one-to-many. I digress. Anyway, I should have already inferred that files is one-to-many against dm_data, and that files is one-to-many against kc_data too. A LEFT JOIN is another hint that the relationship between the first table and the second is one-to-many; it is not definitive, though, since some coders write everything with LEFT JOIN. There's nothing wrong with the LEFT JOINs in your query as such, but when a query joins multiple one-to-many tables and then aggregates, the counts will be wrong, because the rows from one table are repeated for every matching row of the others.

from
    files
        left join
    dm_data ON dm_data.id = files.id
        left join
    kc_data ON kc_data.id = files.id

So, given that files is one-to-many against dm_data and also one-to-many against kc_data, we can conclude that the problem lies in chaining those joins and grouping them in one monolithic query.

For example, suppose you have three tables, app (standing in for files), ios_app (for dm_data) and android_app (for kc_data), and this is the data for iOS:

test=# select * from ios_app order by app_code, date_released;
 ios_app_id | app_code | date_released | price  
------------+----------+---------------+--------
          1 | AB       | 2010-01-01    | 1.0000
          3 | AB       | 2010-01-03    | 3.0000
          4 | AB       | 2010-01-04    | 4.0000
          2 | TR       | 2010-01-02    | 2.0000
          5 | TR       | 2010-01-05    | 5.0000
(5 rows)

And this is the data for Android:

test=# select * from android_app order by app_code, date_released;
 android_app_id | app_code | date_released |  price  
----------------+----------+---------------+---------
              1 | AB       | 2010-01-06    |  6.0000
              2 | AB       | 2010-01-07    |  7.0000
              7 | MK       | 2010-01-07    |  7.0000
              3 | TR       | 2010-01-08    |  8.0000
              4 | TR       | 2010-01-09    |  9.0000
              5 | TR       | 2010-01-10    | 10.0000
              6 | TR       | 2010-01-11    | 11.0000
(7 rows)    
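
To reproduce the example locally, the sample data can be loaded with a script along the following lines. This is only a sketch: the column types are assumptions (the listings show only names and values), and the four rows of app are inferred from the app codes that appear in the query output of this answer.

```sql
-- Hypothetical schema: column types are assumed, names and values come from the listings above.
create table app (
    app_code varchar(10) primary key
);

create table ios_app (
    ios_app_id    int primary key,
    app_code      varchar(10) not null,
    date_released date not null,
    price         decimal(10, 4) not null
);

create table android_app (
    android_app_id int primary key,
    app_code       varchar(10) not null,
    date_released  date not null,
    price          decimal(10, 4) not null
);

-- PM has no releases on either platform; MK has an Android release only.
insert into app (app_code) values ('AB'), ('MK'), ('PM'), ('TR');

insert into ios_app (ios_app_id, app_code, date_released, price) values
    (1, 'AB', '2010-01-01', 1.0000),
    (2, 'TR', '2010-01-02', 2.0000),
    (3, 'AB', '2010-01-03', 3.0000),
    (4, 'AB', '2010-01-04', 4.0000),
    (5, 'TR', '2010-01-05', 5.0000);

insert into android_app (android_app_id, app_code, date_released, price) values
    (1, 'AB', '2010-01-06',  6.0000),
    (2, 'AB', '2010-01-07',  7.0000),
    (3, 'TR', '2010-01-08',  8.0000),
    (4, 'TR', '2010-01-09',  9.0000),
    (5, 'TR', '2010-01-10', 10.0000),
    (6, 'TR', '2010-01-11', 11.0000),
    (7, 'MK', '2010-01-07',  7.0000);
```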

If you merely use this query:

select x.app_code, 
    count(i.date_released) as ios_release_count, 
    count(a.date_released) as android_release_count
from app x
left join ios_app i on i.app_code = x.app_code
left join android_app a on a.app_code = x.app_code
group by x.app_code
order by x.app_code

The output will be wrong:

 app_code | ios_release_count | android_release_count 
----------+-------------------+-----------------------
 AB       |                 6 |                     6
 MK       |                 0 |                     1
 PM       |                 0 |                     0
 TR       |                 8 |                     8
(4 rows)

You can think of chained joins as a Cartesian product: if there are 3 matching rows in the first table and 2 matching rows in the second, the join outputs 3 × 2 = 6 rows.

Here's the visualization. Notice that there are 2 repeating Android AB rows for every iOS AB row. There are 3 iOS AB rows, so what is the count when you do COUNT(ios_app.date_released)? It becomes 6; the same goes for COUNT(android_app.date_released), which is also 6. Likewise, there are 4 repeating Android TR rows for every iOS TR row, and there are 2 TR rows in iOS, so that gives a count of 8.

 app_code | ios_release_date | android_release_date 
----------+------------------+----------------------
 AB       | 2010-01-01       | 2010-01-06
 AB       | 2010-01-01       | 2010-01-07
 AB       | 2010-01-03       | 2010-01-06
 AB       | 2010-01-03       | 2010-01-07
 AB       | 2010-01-04       | 2010-01-06
 AB       | 2010-01-04       | 2010-01-07
 MK       |                  | 2010-01-07
 PM       |                  | 
 TR       | 2010-01-02       | 2010-01-08
 TR       | 2010-01-02       | 2010-01-09
 TR       | 2010-01-02       | 2010-01-10
 TR       | 2010-01-02       | 2010-01-11
 TR       | 2010-01-05       | 2010-01-08
 TR       | 2010-01-05       | 2010-01-09
 TR       | 2010-01-05       | 2010-01-10
 TR       | 2010-01-05       | 2010-01-11
(16 rows)
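
A quick way to see the multiplication in numbers is to compare the joined row count with the number of distinct source rows per app. The query below is a diagnostic sketch against the example tables. It relies on the ios_app_id and android_app_id key columns, which the example tables have but the tables in the question reportedly do not, so it illustrates the problem rather than fixing the original query.

```sql
-- joined_rows is what a plain COUNT sees after the chained joins;
-- ios_rows and android_rows count each source row only once.
select
    x.app_code,
    count(*)                         as joined_rows,
    count(distinct i.ios_app_id)     as ios_rows,
    count(distinct a.android_app_id) as android_rows
from app x
left join ios_app i     on i.app_code = x.app_code
left join android_app a on a.app_code = x.app_code
group by x.app_code
order by x.app_code;

-- With the sample data:
--  AB | 6 | 3 | 2
--  MK | 1 | 0 | 1
--  PM | 1 | 0 | 0
--  TR | 8 | 2 | 4
```

For AB and TR, joined_rows equals ios_rows × android_rows, which is exactly the inflated figure the monolithic query reports in both count columns.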

So what you should do is flatten each result before you join them to other tables and queries.

If your database supports CTEs, use them. They are very neat and self-documenting:

with ios_app_release_count_list as
(
 select app_code, count(date_released) as ios_release_count
 from ios_app
 group by app_code
)
,android_release_count_list as
(
 select app_code, count(date_released) as android_release_count 
 from android_app 
 group by app_code  
)
select
 x.app_code, 
 coalesce(i.ios_release_count,0) as ios_release_count, 
 coalesce(a.android_release_count,0) as android_release_count
from app x
left join ios_app_release_count_list i on i.app_code = x.app_code
left join android_release_count_list a on a.app_code = x.app_code
order by x.app_code;

If your database has no CTE support (MySQL gained CTEs only in version 8.0), use derived tables instead:

select x.app_code, 
 coalesce(i.ios_release_count,0) as ios_release_count, 
 coalesce(a.android_release_count,0) as android_release_count
from app x
left join
(
 select app_code, count(date_released) as ios_release_count
 from ios_app
 group by app_code
) i on i.app_code = x.app_code
left join
(
 select app_code, count(date_released) as android_release_count 
 from android_app 
 group by app_code   
) a on a.app_code = x.app_code
order by x.app_code

That query and the CTE-style query will show the correct output:

 app_code | ios_release_count | android_release_count 
----------+-------------------+-----------------------
 AB       |                 3 |                     2
 MK       |                 0 |                     1
 PM       |                 0 |                     0
 TR       |                 2 |                     4
(4 rows)
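
Applied to the tables from the question, the same flattening might look like the sketch below. It assumes that files.id is unique and that each filename appears only once in files, so no outer GROUP BY is needed. The aliases f, d, kc, kpo, pd and pd_walking are introduced here for illustration, and 'X', 'Y', 'Z' are placeholders for the filename list that the question elides.

```sql
select
    f.filename,
    coalesce(d.clear, 0)        as clear,
    coalesce(d.lightRain, 0)    as lightRain,
    coalesce(kc.bicycle, 0)     as bicycle,
    coalesce(kc.bus, 0)         as bus,
    coalesce(kpo.walking, 0)    as walking,
    coalesce(kpo.pedestrian, 0) as pedestrian,
    coalesce(pd.pd_walking, 0)  as pd_walking
from files f
left join (
    -- one row per id, so the joins below cannot multiply it
    select id,
           count(case when weather like '%clear%' then 1 end)     as clear,
           count(case when weather like '%lightRain%' then 1 end) as lightRain
    from dm_data
    group by id
) d on d.id = f.id
left join (
    select id,
           count(case when type like '%bicycle%' then 1 end) as bicycle,
           count(case when type like '%bus%' then 1 end)     as bus
    from kc_data
    group by id
) kc on kc.id = f.id
left join (
    select id,
           count(case when movement like '%walking%' then 1 end) as walking,
           count(case when type like '%pedestrian%' then 1 end)  as pedestrian
    from kpo_data
    group by id
) kpo on kpo.id = f.id
left join (
    select id,
           count(case when movement like '%walking%' then 1 end) as pd_walking
    from pd_markings
    group by id
) pd on pd.id = f.id
where f.filename in ('X', 'Y', 'Z')  -- placeholder values; the real list is not given
order by f.filename;
```

Because every derived table is reduced to at most one row per id before it is joined, DISTINCT is no longer needed for the dm_data columns, and adding further one-to-many tables should not change the existing counts.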

Live test

Incorrect query: http://www.sqlfiddle.com/#!2/9774a/2

Correct query: http://www.sqlfiddle.com/#!2/9774a/1
