MySQL组by和max返回错误的行 [英] MySQL group by and max returns wrong rows

查看:94
本文介绍了MySQL组by和max返回错误的行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

  CREATE TABLE IF 

我有两张表,我试着找到每天得分最高的post。 NOT EXISTS`posts_points`(
`post_id` int(10)unsigned NOT NULL,
`comment` smallint(5)unsigned NOT NULL,
`likes` smallint(5)unsigned NOT NULL ,
`favorites` smallint(5)unsigned NOT NULL,
PRIMARY KEY(`post_id`)
)ENGINE = MyISAM DEFAULT CHARSET = utf8;

$ b CREATE TABLE如果不存在`posts`(
`profile_id` int(10)unsigned NOT NULL,
`post_id` int(10)unsigned NOT NULL ,
`pubdate_utc` datetime NOT NULL,
PRIMARY KEY(`post_id`),
KEY`profile_id`(`profile_id`),
)ENGINE = MyISAM DEFAULT CHARSET = utf8 ROW_FORMAT = DYNAMIC;

我试过下面的查询。它返回正确的分数,但其他列只是随机行。

  SELECT p.post_id,p.profile_id 
,MAX(t1.score)
,DATE_FORMAT(t1.pubdate_utc,'%d%b')post_date
,DATE(t1.pubdate_utc)mydate
FROM

SELECT p.profile_id,p .post_id,p.pubdate_utc
,(pp.comments + pp.likes + pp.favorites)AS score
FROM posts p
INNER JOIN posts_points pp ON p.post_id = pp.post_id
)t1
INNER JOIN帖子p ON t1.post_id = p.post_id
AND t1.pubdate_utc = p.pubdate_utc
GROUP BY mydate
ORDER BY mydate DESC
LIMIT 18;


解决方案

我一直遇到这个问题。当MySQL运行一个聚合函数时,对于任何非聚合列,它只需为该组提供首次运行的数据,无论它是否来自MAX行。所以你必须做的是在内部查询中排序数据,以便maxes在他们的组中排第一位。看看这是否适合你:

  SELECT t.post_id,
t.profile_id,
t。分数,
t.pubdate_utc
FROM(SELECT p.profile_id,
p.post_id,
p.pubdate_utc,
(pp.comments + pp.likes + pp .favorites)score
FROM posts p
JOIN posts_points pp ON p.post_id = pp.post_id
WHERE p.pubdate_utc> = DATE_ADD(DATE(NOW()),INTERVAL -17 DAY )
ORDER BY得分DESC
)t
GROUP BY DATE(t.pubdate_utc)DESC
;

请注意,我在这里不使用MAX函数。按分数降序排序,然后在外部查询中按日期分组将按日期提高最高分数。另外请注意,我将WHERE子句放在内部查询中。像这样的内部查询(有时是必要的)效率不高,因为它们没有外部查询优化的索引,所以请确保内部结果集尽可能小。最后,注意GROUP BY DATE(t.pubdate_utc)。如果我没有把它缩减到只有日期信息,那么将会有超过18个结果,因为时间也被计算在内。



编辑:更改为 INTERVAL -17 DAY 放弃18个结果而不是19个。


I have two tables and I try to find the "post" with the highest score per day.

CREATE TABLE IF NOT EXISTS `posts_points` (
  `post_id` int(10) unsigned NOT NULL,
  `comments` smallint(5) unsigned NOT NULL,
  `likes` smallint(5) unsigned NOT NULL,
  `favorites` smallint(5) unsigned NOT NULL,
   PRIMARY KEY (`post_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


CREATE TABLE IF NOT EXISTS `posts` (
  `profile_id` int(10) unsigned NOT NULL,
  `post_id` int(10) unsigned NOT NULL,
  `pubdate_utc` datetime NOT NULL,
  PRIMARY KEY (`post_id`),
  KEY `profile_id` (`profile_id`),
) ENGINE=MyISAM DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;

I have tried the query below. It returns the correct score but the other columns are just random rows. What am I doing wrong ?

SELECT p.post_id, p.profile_id
   , MAX(t1.score)
   , DATE_FORMAT(t1.pubdate_utc, '%d %b') post_date
   , DATE(t1.pubdate_utc) mydate
FROM
(
   SELECT p.profile_id, p.post_id, p.pubdate_utc
      , (pp.comments + pp.likes + pp.favorites) AS score
   FROM posts p 
   INNER JOIN posts_points pp ON p.post_id = pp.post_id
) t1
INNER JOIN posts p ON t1.post_id = p.post_id
   AND t1.pubdate_utc = p.pubdate_utc
GROUP BY mydate
ORDER BY mydate DESC
LIMIT 18;

解决方案

I run into this problem all the time. When MySQL runs an aggregate function, for any non-aggregated columns, it simply pulls the first data it runs across for that group, whether it is from the MAX row or not. So what you have to do is order the data in an inner query such that the maxes are first in their groups. See if this works for you:

SELECT t.post_id,
       t.profile_id,
       t.score,
       t.pubdate_utc
FROM (SELECT p.profile_id,
             p.post_id,
             p.pubdate_utc,
             (pp.comments + pp.likes + pp.favorites) score
      FROM posts p
      JOIN posts_points pp ON p.post_id = pp.post_id
      WHERE p.pubdate_utc >= DATE_ADD(DATE(NOW()), INTERVAL -17 DAY)
      ORDER BY score DESC
     ) t
GROUP BY DATE(t.pubdate_utc) DESC
;

Notice that I use no MAX function here. Ordering by score descending and then grouping by date in the outer query will pull up the highest score by date. Also notice that I put the WHERE clause in the inner query. Inner queries like this (tho sometimes necessary) are not very efficient, since they have no indexes for the outer query to optimize on, so make sure your inner result set is as small as it can be. Lastly, notice the GROUP BY DATE(t.pubdate_utc). If I did not reduce it down to just the date information, there would be a lot more than 18 results, as times are also counted then.

Edit: Changed to INTERVAL -17 DAY to give up to 18 results instead of 19.

这篇关于MySQL组by和max返回错误的行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆