MySQL table index optimization


Question

I'm working with an application that has a MySQL database at Amazon RDS. The table in question is set up as follows:

CREATE TABLE `log` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` datetime NOT NULL,
  `username` varchar(45) NOT NULL,
  .. snip some varchar and int fields ..
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

This system has been in beta for a while, and the dataset is already quite large. The queries are starting to get rather slow.

SELECT COUNT(*) FROM log;  -- returns 16307224 (takes 105 seconds to complete)

This table is used almost exclusively to build a single report from a query like this:

SELECT timestamp, username, [a few more] FROM log 
WHERE timestamp  BETWEEN '2012-03-30 08:00:00' AND '2012-03-30 16:00:00' 
AND username='XX' 

This typically returns between 1000 and 6000 rows and takes around 100-180 seconds to complete, so the web application often times out and leaves an empty report. (I will look into the timeout as well, but this question is about the root cause.)

I'm not very good with databases, but my guess is that the BETWEEN is what's killing me here. I'm thinking that I should perhaps use the timestamp as an index somehow. Timestamp together with username should still provide uniqueness (I don't use the id field for anything).

If anyone out there has suggestions for optimizations, I'm all ears.

Update:

The table has now been altered to the following:

CREATE TABLE `log` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` datetime NOT NULL,
  `username` varchar(45) NOT NULL,
  .. snip ..
  `task_id` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_un_ts` (`timestamp`,`username`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

EXPLAIN of the SELECT statement returns the following:

id => 1
select_type => SIMPLE
table => log
type => range
possible_keys => index_un_ts
key => index_un_ts
key_len => 55
ref => 
rows => 52258
Extra => Using where; Using index

Recommended answer

An index on the timestamp and username columns would be helpful. You also need to be able to read the output of an EXPLAIN statement.

Go to MySQL and run the following:

EXPLAIN SELECT timestamp, username, [a few more] FROM log 
WHERE timestamp  BETWEEN '2012-03-30 08:00:00' AND '2012-03-30 16:00:00' 
AND username='XX' 

This shows you the plan MySQL uses to execute the query. There will be a column called key, which indicates which index MySQL is using for the query. Without a suitable index, I suspect key will be NULL and the type column will show ALL, which means MySQL is scanning the table from top to bottom, matching every row against your WHERE clause. Now create an index on the timestamp and username columns and run the EXPLAIN statement again. You should see the index that you created in the key column.
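As a rough sketch of that step, and assuming the user column meant here is the `username` column from the question's schema, the index could be created and checked as below. The index name `idx_username_timestamp` is made up for illustration. The column order is a suggestion rather than part of the answer: for a query that combines an equality test with a range test, MySQL can generally narrow the scan with both key parts only when the equality column comes first.

```sql
-- Sketch only: idx_username_timestamp is a hypothetical index name.
-- Equality column (username) first, range column (timestamp) second,
-- so both key parts can be used to bound the range scan.
ALTER TABLE `log`
  ADD INDEX `idx_username_timestamp` (`username`, `timestamp`);

-- Re-check the plan. 'XX' and the dates are the placeholder values from the
-- question; the other selected columns are elided there and omitted here.
EXPLAIN SELECT `timestamp`, `username`
FROM `log`
WHERE `username` = 'XX'
  AND `timestamp` BETWEEN '2012-03-30 08:00:00' AND '2012-03-30 16:00:00';
```

If the new index is chosen, its name should appear in the key column, and the rows estimate should be closer to the number of rows the report actually returns.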

If MySQL uses the index, your query should be considerably faster. Just remember not to over-index. Indexes make inserts, updates and deletes slower. When you insert a new row into a table that has three indexes, MySQL also has to write an entry into each of the three indexes. So it is a double-edged sword.
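To keep an eye on that trade-off, the indexes on the table can be listed and any that are no longer needed can be dropped. The statements below are a minimal sketch. Dropping `index_un_ts` (the index from the question's update) is only an illustration, and assumes you have confirmed that no other query depends on it.

```sql
-- List every index currently defined on the table.
SHOW INDEX FROM `log`;

-- Hypothetical cleanup: drop an index only after confirming nothing relies on it.
ALTER TABLE `log` DROP INDEX `index_un_ts`;
```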

