MYSQL shows incorrect rows when using GROUP BY


Problem description



    I have two tables:

    article('id', 'ticket_id', 'incoming_time', 'to', 'from', 'message')
    ticket('id', 'queue_id')
    

    where tickets represent a thread of emails between support staff and customers, and articles are the individual messages that compose a thread.

    I'm looking to find the article with the highest incoming time (expressed as a unix timestamp) for each ticket_id, and this is the query I'm currently using:

    SELECT article.* , MAX(article.incoming_time) as maxtime
    FROM ticket, article
    WHERE ticket.id = article.ticket_id
    AND ticket.queue_id = 1
    GROUP BY article.ticket_id
    

    For example,

    :article:
    id --- ticket_id --- incoming_time --- to ------- from ------- message --------
    11     1             1234567           help@      client@      I need help...   
    12     1             1235433           client@    help@        How can we help?
    13     1             1240321           help@      client@      Want food!    
    ...
    
    :ticket:
    id --- queue_id
    1      1
    ...
    

    But the result appears to be the row with the smallest article id, rather than the article with the highest incoming time that I'm looking for.

    Any advice would be greatly appreciated!

    Solution

    This is a classic hurdle that most MySQL programmers bump into.

    • You have a column ticket_id that is the argument to GROUP BY. Distinct values in this column define the groups.
    • You have a column incoming_time that is the argument to MAX(). The greatest value in this column over the rows in each group is returned as the value of MAX().
    • You have all other columns of table article. The values returned for these columns are arbitrary, not from the same row where the MAX() value occurs.

    The database cannot infer that you want values from the same row where the max value occurs.
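To make the point above concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for MySQL (an assumption made only so the example runs anywhere). The `article` table is reduced to four columns, and the `ids_with_time` helper is a hypothetical name introduced for illustration. It shows that each aggregate is computed over the whole group, so `MIN()` and `MAX()` can come from different rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        ticket_id INTEGER,
        incoming_time INTEGER,
        message TEXT
    );
    INSERT INTO article VALUES
        (11, 1, 1234567, 'I need help...'),
        (12, 1, 1235433, 'How can we help?'),
        (13, 1, 1240321, 'Want food!');
""")

# Aggregates only: each value is computed over the group, with no link to a row.
min_time, max_time = conn.execute(
    "SELECT MIN(incoming_time), MAX(incoming_time) "
    "FROM article GROUP BY ticket_id"
).fetchone()


def ids_with_time(t):
    """Return the ids of the articles whose incoming_time equals t."""
    rows = conn.execute(
        "SELECT id FROM article WHERE incoming_time = ? ORDER BY id", (t,)
    )
    return [r[0] for r in rows]


min_ids = ids_with_time(min_time)
max_ids = ids_with_time(max_time)
print(min_time, min_ids)  # 1234567 [11]
print(max_time, max_ids)  # 1240321 [13]
```

The minimum lives in article 11 and the maximum in article 13, so there is no single row that the other columns could be taken from.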

    Think about the following cases:

    • There are multiple rows where the same max value occurs. Which row should be used to show the columns of article.*?

    • You write a query that returns both the MIN() and the MAX(). This is legal, but which row should article.* show?

      SELECT article.* , MIN(article.incoming_time), MAX(article.incoming_time)
      FROM ticket, article
      WHERE ticket.id = article.ticket_id
      AND ticket.queue_id = 1
      GROUP BY article.ticket_id
      

    • You use an aggregate function such as AVG() or SUM(), where no row has that value. How is the database to guess which row to display?

      SELECT article.* , AVG(article.incoming_time)
      FROM ticket, article
      WHERE ticket.id = article.ticket_id
      AND ticket.queue_id = 1
      GROUP BY article.ticket_id
      

    In most brands of database -- as well as the SQL standard itself -- you aren't allowed to write a query like this, because of the ambiguity. You can't include any column in the select-list that isn't inside an aggregate function or named in the GROUP BY clause.
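One way to stay within that rule is to compute the per-ticket maximum in a grouped subquery and join it back to `article`. This is a different technique from the query given later in this answer, shown only as a sketch. It uses Python's `sqlite3` module as a stand-in for MySQL, a reduced `article` schema, and a derived-table alias `latest`; tickets 2 and 3 and their articles are hypothetical rows added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, queue_id INTEGER);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        ticket_id INTEGER,
        incoming_time INTEGER,
        message TEXT
    );
    INSERT INTO ticket VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO article VALUES
        (11, 1, 1234567, 'I need help...'),
        (12, 1, 1235433, 'How can we help?'),
        (13, 1, 1240321, 'Want food!'),
        (21, 2, 1250000, 'Printer is down'),
        (22, 2, 1249000, 'Which printer?'),
        (31, 3, 1260000, 'Other queue');
""")

# The subquery selects only the grouped column and an aggregate,
# so it is valid under the standard rule. The outer join then picks
# the full article row whose incoming_time equals that maximum.
query = """
    SELECT a.id, a.ticket_id, a.incoming_time, a.message
    FROM ticket t
    JOIN article a ON a.ticket_id = t.id
    JOIN (
        SELECT ticket_id, MAX(incoming_time) AS maxtime
        FROM article
        GROUP BY ticket_id
    ) latest ON latest.ticket_id = a.ticket_id
            AND latest.maxtime = a.incoming_time
    WHERE t.queue_id = 1
    ORDER BY a.ticket_id
"""
latest_rows = conn.execute(query).fetchall()
for row in latest_rows:
    print(row)
```

With this data the query returns articles 13 and 21, one per ticket in queue 1. If two articles of a ticket share the same maximum `incoming_time`, both are returned.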

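    MySQL is more permissive. It lets you do this, and leaves it up to you to write queries without ambiguity. If you do have ambiguity, the values returned for the non-grouped columns are indeterminate; in practice they often come from the row that is physically first in the group, but this is up to the storage engine and is not guaranteed. Note that since MySQL 5.7.5 the ONLY_FULL_GROUP_BY SQL mode is enabled by default, and with it MySQL rejects such queries just as other databases do.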

    For what it's worth, SQLite also accepts such queries, but it has been observed to choose the last row in the group to resolve the ambiguity (an implementation detail, not something to rely on). Go figure. If the SQL standard doesn't say what to do, it's up to the vendor implementation.

    Here's a query that can solve your problem for you:

    SELECT a1.* , a1.incoming_time AS maxtime
    FROM ticket t JOIN article a1 ON (t.id = a1.ticket_id)
    LEFT OUTER JOIN article a2 ON (t.id = a2.ticket_id 
      AND a1.incoming_time < a2.incoming_time)
    WHERE t.queue_id = 1
      AND a2.ticket_id IS NULL;
    

    In other words, look for a row (a1) for which there is no other row (a2) with the same ticket_id and a greater incoming_time. If no such row exists, the LEFT OUTER JOIN fills the a2 columns with NULL instead of a match, and the a2.ticket_id IS NULL condition keeps exactly those a1 rows.
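The query above can be tried against the sample data from the question. The sketch below runs it through Python's `sqlite3` module, used here as a stand-in for MySQL, with a reduced `article` schema. Article 14 and the `TIE_BROKEN` variant are hypothetical additions for illustration: they cover the earlier case where several rows share the same maximum, and the tie-breaker assumes that `id` is unique and that the higher `id` should win.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, queue_id INTEGER);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        ticket_id INTEGER,
        incoming_time INTEGER,
        message TEXT
    );
    INSERT INTO ticket VALUES (1, 1);
    INSERT INTO article VALUES
        (11, 1, 1234567, 'I need help...'),
        (12, 1, 1235433, 'How can we help?'),
        (13, 1, 1240321, 'Want food!');
""")

# The query from the answer, with an explicit column list and ORDER BY
# so that the printed result is stable.
ANTI_JOIN = """
    SELECT a1.id, a1.ticket_id, a1.incoming_time AS maxtime
    FROM ticket t JOIN article a1 ON (t.id = a1.ticket_id)
    LEFT OUTER JOIN article a2 ON (t.id = a2.ticket_id
      AND a1.incoming_time < a2.incoming_time)
    WHERE t.queue_id = 1
      AND a2.ticket_id IS NULL
    ORDER BY a1.id
"""
rows = conn.execute(ANTI_JOIN).fetchall()
print(rows)  # [(13, 1, 1240321)]

# Hypothetical tie: a second article with the same maximum incoming_time.
conn.execute("INSERT INTO article VALUES (14, 1, 1240321, 'Still hungry')")
tied = conn.execute(ANTI_JOIN).fetchall()
print(tied)  # [(13, 1, 1240321), (14, 1, 1240321)]

# Assumed tie-breaker: among equal times, the row with the larger id wins.
TIE_BROKEN = """
    SELECT a1.id, a1.ticket_id, a1.incoming_time AS maxtime
    FROM ticket t JOIN article a1 ON (t.id = a1.ticket_id)
    LEFT OUTER JOIN article a2 ON (t.id = a2.ticket_id
      AND (a1.incoming_time < a2.incoming_time
           OR (a1.incoming_time = a2.incoming_time AND a1.id < a2.id)))
    WHERE t.queue_id = 1
      AND a2.ticket_id IS NULL
    ORDER BY a1.id
"""
unique = conn.execute(TIE_BROKEN).fetchall()
print(unique)  # [(14, 1, 1240321)]
```

Without a tie-breaker the anti-join returns every row that shares the maximum, which may or may not be what is wanted. Extending the join condition is one possible way to get exactly one row per ticket.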
