Speeding up row counting in MySQL

Problem description

Suppose, for illustrative purposes, you are running a library using a simple MySQL "books" table with three columns:

(id, title, status)

  • id is the primary key
  • title is the title of the book
  • status could be an enum describing the book's current state (e.g. AVAILABLE, CHECKEDOUT, PROCESSING, MISSING)
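For concreteness, the table described above might be declared roughly as follows. This is only a sketch: the column types, the lengths, and the choice of InnoDB are assumptions, since the question gives just the column names and the possible status values.

```sql
-- Assumed definition of the books table; types and sizes are illustrative only.
CREATE TABLE books (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title  VARCHAR(255) NOT NULL,
    -- The question says status "could be an enum" with these example values.
    status ENUM('AVAILABLE', 'CHECKEDOUT', 'PROCESSING', 'MISSING') NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;
```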

A simple query to report how many books fall into each state is:

SELECT status, COUNT(*) FROM books GROUP BY status;

or, to find specifically how many books are available:

SELECT COUNT(*) FROM books WHERE status = 'AVAILABLE';

However, once the table grows to millions of rows, these queries take several seconds to complete. Adding an index to the "status" column doesn't appear to make a difference in my experience.

Aside from periodically caching the results or explicitly updating summary info in a separate table each time a book changes state (via triggers or some other mechanism), are there any techniques for speeding up these kinds of queries? It seems that the COUNT queries end up looking at every row, and (without knowing more details) I'm a bit surprised that this information can't somehow be determined from the index.
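One way to check whether the index is being used for these counts is EXPLAIN. A minimal sketch, assuming the index on status is called idx_books_status (a hypothetical name; the question does not give one):

```sql
-- Hypothetical index name; the question only says the status column is indexed.
CREATE INDEX idx_books_status ON books (status);

-- Show the execution plan for both counting queries.
EXPLAIN SELECT status, COUNT(*) FROM books GROUP BY status;
EXPLAIN SELECT COUNT(*) FROM books WHERE status = 'AVAILABLE';
```

If the Extra column of the plan shows "Using index", the count is answered from the index alone without touching the table rows. It generally still has to read every matching index entry, though, so the cost keeps growing with the number of rows being counted.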

UPDATE

Using the sample table (with an indexed "status" column) with 2 million rows, I benchmarked the GROUP BY query. Using the InnoDB storage engine, the query takes 3.0 - 3.2 seconds on my machine. Using MyISAM, the query takes 0.9 - 1.1 seconds. There was no significant difference between count(*), count(status), or count(1) in either case.

MyISAM is admittedly a bit faster, but I was curious to see if there was a way to make an equivalent query run much faster (e.g. 10-50 ms -- fast enough to be called on every webpage request for a low-traffic site) without the mental overhead of caching and triggers. It sounds like the answer is "there's no way to run the direct query quickly" which is what I expected - I just wanted to make sure I wasn't missing an easy alternative.

Solution

So the question is:

"Are there any techniques for speeding up these kinds of queries?"

Well, not really. A column-based storage engine would probably be faster with those SELECT COUNT(*) queries but it would be less performant for pretty much any other query.

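Your best bet is to maintain a summary table via triggers. It doesn't have much overhead, and the SELECT part will be instantaneous no matter how big the table is. Here's some boilerplate code, which assumes a summary table named books_cnt with one row per status and the columns status and total: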

DELIMITER //

-- A new book increments the counter for its status.
CREATE TRIGGER ai_books AFTER INSERT ON books
FOR EACH ROW UPDATE books_cnt SET total = total + 1 WHERE status = NEW.status
//
-- A deleted book decrements the counter for its status.
CREATE TRIGGER ad_books AFTER DELETE ON books
FOR EACH ROW UPDATE books_cnt SET total = total - 1 WHERE status = OLD.status
//
-- A status change moves one unit from the old counter to the new one.
CREATE TRIGGER au_books AFTER UPDATE ON books
FOR EACH ROW
BEGIN
    IF (OLD.status <> NEW.status)
    THEN
        UPDATE books_cnt SET total = total + IF(status = NEW.status, 1, -1) WHERE status IN (OLD.status, NEW.status);
    END IF;
END
//

-- Restore the default statement delimiter.
DELIMITER ;
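The triggers only UPDATE rows that already exist in books_cnt, so the summary table has to be created and seeded first. The answer does not show that step; the following is a sketch under assumptions: the column types are guesses, and the four status values are the examples from the question.

```sql
-- Assumed schema: the triggers only reveal that books_cnt has status and total columns.
CREATE TABLE books_cnt (
    status VARCHAR(20) NOT NULL,
    total  INT NOT NULL DEFAULT 0,
    PRIMARY KEY (status)
);

-- One row per status, so the UPDATE statements in the triggers always find a row.
INSERT INTO books_cnt (status, total) VALUES
    ('AVAILABLE', 0), ('CHECKEDOUT', 0), ('PROCESSING', 0), ('MISSING', 0);

-- Fill in the current counts from the existing data.
UPDATE books_cnt AS c
JOIN (SELECT status, COUNT(*) AS n FROM books GROUP BY status) AS b
    ON b.status = c.status
SET c.total = b.n;

-- The two reports from the question now read only a handful of rows.
SELECT status, total FROM books_cnt;
SELECT total FROM books_cnt WHERE status = 'AVAILABLE';
```

Seeding is best done while nothing is writing to books. Otherwise a row inserted between the initial count and the creation of the triggers would be missed, and the totals would stay off by that amount.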
