优化mysql计数查询 [英] optimize mysql count query
问题描述
是否有进一步优化此方法的方法,还是应该让我感到满意,所以需要花费9秒才能计数1100万行?
Is there a way to optimize this further or should I just be satisfied that it takes 9 seconds to count 11M rows ?
devuser@xcmst > mysql --user=user --password=pass -D marctoxctransformation -e "desc record_updates"
+--------------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+----------+------+-----+---------+-------+
| record_id | int(11) | YES | MUL | NULL | |
| date_updated | datetime | YES | MUL | NULL | |
+--------------+----------+------+-----+---------+-------+
devuser@xcmst > date; mysql --user=user --password=pass -D marctoxctransformation -e "select count(*) from record_updates where date_updated > '2009-10-11 15:33:22' "; date
Thu Dec 9 11:13:17 EST 2010
+----------+
| count(*) |
+----------+
| 11772117 |
+----------+
Thu Dec 9 11:13:26 EST 2010
devuser@xcmst > mysql --user=user --password=pass -D marctoxctransformation -e "explain select count(*) from record_updates where date_updated > '2009-10-11 15:33:22' "
+----+-------------+----------------+-------+--------------------------------------------------------+--------------------------------------------------------+---------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+-------+--------------------------------------------------------+--------------------------------------------------------+---------+------+----------+--------------------------+
| 1 | SIMPLE | record_updates | index | idx_marctoxctransformation_record_updates_date_updated | idx_marctoxctransformation_record_updates_date_updated | 9 | NULL | 11772117 | Using where; Using index |
+----+-------------+----------------+-------+--------------------------------------------------------+--------------------------------------------------------+---------+------+----------+--------------------------+
devuser@xcmst > mysql --user=user --password=pass -D marctoxctransformation -e "show keys from record_updates"
+----------------+------------+--------------------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+--------------------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
| record_updates | 1 | idx_marctoxctransformation_record_updates_date_updated | 1 | date_updated | A | 2416 | NULL | NULL | YES | BTREE | |
| record_updates | 1 | idx_marctoxctransformation_record_updates_record_id | 1 | record_id | A | 11772117 | NULL | NULL | YES | BTREE | |
+----------------+------------+--------------------------------------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
更新-我的解决方案在这里: http://code.google.com/p/xcmetadataservicestoolkit/wiki/ResumptionToken
UPDATE - my solution is here: http://code.google.com/p/xcmetadataservicestoolkit/wiki/ResumptionToken
推荐答案
如果mysql必须计算11M行,那么实际上没有太多方法可以加快简单计数的速度.至少不要让它达到1秒以下的速度.您应该重新考虑如何计算.一些想法:
If mysql has to count 11M rows, there really isn't much of a way to speed up a simple count. At least not to get it to a sub 1 second speed. You should rethink how you do your count. A few ideas:
-
向表中添加一个自动递增字段.看起来您不会从表中删除,因此您可以使用简单的数学来查找记录数.选择最初的较早日期的最小自动递增编号,然后选择较晚的日期的最大自动递增编号,然后从另一个中减去一个以获得记录计数.例如:
Add an auto increment field to the table. It looks you wouldn't delete from the table, so you can use simple math to find the record count. Select the min auto increment number for the initial earlier date and the max for the latter date and subtract one from the other to get the record count. For example:
SELECT min(incr_id) min_id FROM record_updates WHERE date_updated BETWEEN '2009-10-11 15:33:22' AND '2009-10-12 23:59:59';
SELECT max(incr_id) max_id FROM record_updates WHERE date_updated > DATE_SUB(NOW(), INTERVAL 2 DAY);`
创建另一个表,该表汇总每天的记录计数.然后,您可以查询该表以获取总记录.每年只有365条记录.如果需要更细粒度的时间,请在汇总表中查询整天,并在当前表中查询开始和结束日期的记录数.然后将它们全部加在一起.
Create another table summarizing the record count for each day. Then you can query that table for the total records. There would only be 365 records for each year. If you need to get down to more fine grained times, query the summary table for full days and the current table for just the record count for the start and end days. Then add them all together.
如果数据没有变化(看起来好像没有变化),则摘要表将易于维护和更新.他们将大大加快速度.
If the data isn't changing, which it doesn't seem like it is, then summary tables will be easy to maintain and update. They will significantly speed things up.
这篇关于优化mysql计数查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!