Optimizing MySQL query with SUM, date range and Group By

Question

I have the following table:

CREATE TABLE IF NOT EXISTS stats (
    date date NOT NULL DEFAULT '0000-00-00',
    cid int(8) NOT NULL DEFAULT '0',
    v bigint(15) NOT NULL DEFAULT '0',
    c bigint(15) NOT NULL DEFAULT '0',
    a bigint(15) NOT NULL DEFAULT '0',
PRIMARY KEY (date,cid),
KEY date (date),
KEY cid (cid),
KEY date_cid_vca (date,cid,v,c,a)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

This table has 30,842,712 rows.

CREATE TABLE IF NOT EXISTS camp (
id int(8) NOT NULL AUTO_INCREMENT,
name varchar(80) NOT NULL DEFAULT '',
PRIMARY KEY (id,name)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

This table has 1985 rows.

I have the following query:

SELECT
    c.id,
    c.name,
    SUM(s.v) AS sumv,
    SUM(s.c) AS sumc,
    GREATEST(((SUM(s.c)/SUM(s.v))*100.00), 0.00) AS cratio,
    SUM(s.a) AS suma,
    GREATEST(((SUM(s.a)/SUM(s.c))*100.00), 0.00) AS aratio
FROM
    stats s, camp c
WHERE
    s.date >= '2012-02-01' AND
    s.date <= '2012-02-29' AND
    c.id=s.cid
GROUP BY s.cid;

EXPLAIN shows:

+----+-------------+-------+-------+-------------------------------+--------------+---------+---------------------+---------+-----------------------------------------------------------+
| id | select_type | table | type  | possible_keys                 | key          | key_len | ref                 | rows    | Extra                                                     |
+----+-------------+-------+-------+-------------------------------+--------------+---------+---------------------+---------+-----------------------------------------------------------+
|  1 | SIMPLE      | s     | range | PRIMARY,date,cid,date_cid_vca | date_cid_vca | 3       | NULL                | 1010265 | Using where; Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | c     | ref   | PRIMARY                       | PRIMARY      | 4       | db.s.cid            |       1 | Using index                                               |
+----+-------------+-------+-------+-------------------------------+--------------+---------+---------------------+---------+-----------------------------------------------------------+

The problem is that the query takes around 50 seconds to complete even though it uses indexes. Is there any other way to optimize it?

Thanks!

Recommended answer

You've done well in optimizing this query with your indexes. I suppose you really do have over 1 million rows in stats in that date range. Unfortunately, joining (and then grouping) 1 million rows is a lot to ask from a database, even with a covering index. For better performance, you will need to beef up the hardware, start denormalizing (put the camp columns inside stats to avoid the join), or keep running totals for each camp instead of calculating them on the fly.
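
The "running totals" option could look something like the sketch below. The table name stats_monthly and its month column are hypothetical and not part of the original schema, and the sketch assumes reports are always run over whole calendar months; arbitrary date ranges would need a daily or differently keyed rollup instead.

```sql
-- Hypothetical summary table: one row per campaign per calendar month.
CREATE TABLE IF NOT EXISTS stats_monthly (
    month date NOT NULL,              -- first day of the month, e.g. '2012-02-01'
    cid int NOT NULL,
    v bigint NOT NULL DEFAULT 0,
    c bigint NOT NULL DEFAULT 0,
    a bigint NOT NULL DEFAULT 0,
    PRIMARY KEY (month, cid)
);

-- Rebuild the totals for one month (run after that month's stats rows change).
DELETE FROM stats_monthly WHERE month = '2012-02-01';

INSERT INTO stats_monthly (month, cid, v, c, a)
SELECT '2012-02-01', s.cid, SUM(s.v), SUM(s.c), SUM(s.a)
FROM stats s
WHERE s.date >= '2012-02-01'
  AND s.date <= '2012-02-29'
GROUP BY s.cid;

-- The report then reads the small summary table instead of scanning stats.
SELECT
    c.id,
    c.name,
    m.v AS sumv,
    m.c AS sumc,
    GREATEST((m.c / m.v) * 100.00, 0.00) AS cratio,
    m.a AS suma,
    GREATEST((m.a / m.c) * 100.00, 0.00) AS aratio
FROM stats_monthly m
JOIN camp c ON c.id = m.cid
WHERE m.month = '2012-02-01';
```

The expensive aggregation over stats still happens, but only when the summary is rebuilt rather than on every report request. How often to rebuild it depends on how the stats rows are loaded.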

Edit

Since removing the 1 million+ joins seems to have made a large impact, you can try something like this:

SELECT c.*, a.* FROM
(SELECT
    SUM(s.v) AS sumv,
    SUM(s.c) AS sumc,
    GREATEST(((SUM(s.c)/SUM(s.v))*100.00), 0.00) AS cratio,
    SUM(s.a) AS suma,
    GREATEST(((SUM(s.a)/SUM(s.c))*100.00), 0.00) AS aratio,
    s.cid
FROM
    stats s
WHERE
    s.date >= '2012-02-01'
   AND s.date <= '2012-02-29'
GROUP BY s.cid) a
JOIN
  camp c
  ON c.id = a.cid;

This query aggregates stats first and then performs the join on the much smaller grouped result set (at most one row per cid) instead of on every matching stats row.
