Why can an index make a query really slow?


Question

Some time ago I answered a question on SO (it was accepted as correct), but the answer left me with a nagging doubt.
In short, the user had a table with these fields:

id INT PRIMARY KEY
dt DATETIME (with an INDEX)
lt DOUBLE

The query SELECT DATE(dt), AVG(lt) FROM table GROUP BY DATE(dt) was really slow. We told him that (part of) the problem was using DATE(dt) both as the selected field and as the grouping key, but the db was on a production server and it wasn't possible to split that field.
So another field, da DATE (with an INDEX), was added and filled automatically with DATE(dt) by a trigger. The query SELECT da, AVG(lt) FROM table GROUP BY da was a bit faster, but with about 8 million records it still took about 60s!
I tried it on my PC and finally discovered that, after removing the index on field da, the query took only 7s, while the query using DATE(dt), after removing the index, took 13s.
I had always thought that an index on the column used for grouping would speed the query up, not the contrary (8 times slower!).
Why? What is the reason?
Thanks a lot.

Recommended answer

Because you still need to read all the data from both the index and the data file. Since you're not using any WHERE condition, the query plan will always have to access all the data, row by row, and there is nothing you can do about that.

If performance matters for this query and it runs often, I'd suggest caching the results in a temporary (summary) table and refreshing it hourly (daily, etc.).
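
The caching suggestion can be sketched roughly as follows. This is only an illustration, not the setup from the question: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the table names `measurements` and `daily_avg`, the function `refresh_daily_avg`, and the sample rows are all made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the production MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurements (id INTEGER PRIMARY KEY, dt TEXT, lt REAL);
    -- Hypothetical cache table: one row per day.
    CREATE TABLE daily_avg (da TEXT PRIMARY KEY, avg_lt REAL);
""")
conn.executemany(
    "INSERT INTO measurements (dt, lt) VALUES (?, ?)",
    [
        ("2024-01-01 08:00:00", 1.0),
        ("2024-01-01 20:00:00", 3.0),
        ("2024-01-02 09:30:00", 10.0),
    ],
)
conn.commit()


def refresh_daily_avg(conn):
    """Rebuild the cache; meant to be called from a scheduled job."""
    with conn:  # one transaction, so readers never see a half-built cache
        conn.execute("DELETE FROM daily_avg")
        conn.execute(
            "INSERT INTO daily_avg (da, avg_lt) "
            "SELECT DATE(dt), AVG(lt) FROM measurements GROUP BY DATE(dt)"
        )


refresh_daily_avg(conn)

# The expensive GROUP BY now runs once per refresh; readers hit the cache.
rows = conn.execute("SELECT da, avg_lt FROM daily_avg ORDER BY da").fetchall()
print(rows)
```

Readers then query the small `daily_avg` table instead of grouping millions of rows on every request. The trade-off is that results can be as stale as the refresh interval.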

Why it becomes slower: the data in an index is already sorted, so when MySQL estimates the cost of executing the query it assumes it will be cheaper to read the already-sorted data, group it, and then calculate the aggregates. In this case that assumption does not hold.
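
To make the "index + data file" point concrete, here is a toy model in Python. It is not a measurement of MySQL and does not reproduce its cost model. It only counts how many items each plan touches and how often the next row fetched is not the one stored right after the previous one. The data, the `aggregate` helper, and the assumption that rows for one day are not stored contiguously are all hypothetical.

```python
from collections import defaultdict

# Toy table in storage order: (da, lt). Hypothetical data; the two days
# are interleaved so that index order differs from storage order.
table = [
    ("2024-01-02", 10.0),
    ("2024-01-01", 1.0),
    ("2024-01-02", 20.0),
    ("2024-01-01", 3.0),
]


def aggregate(positions):
    """Visit rows in the given order; return (averages, out-of-order fetches)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    jumps = 0
    previous = None
    for pos in positions:
        if previous is not None and pos != previous + 1:
            jumps += 1  # the next row is not adjacent to the previous one
        da, lt = table[pos]
        sums[da] += lt
        counts[da] += 1
        previous = pos
    averages = {da: sums[da] / counts[da] for da in sums}
    return averages, jumps


# Plan 1: full table scan, rows read in storage order.
scan_avg, scan_jumps = aggregate(range(len(table)))

# Plan 2: walk an index on da (sorted by da), then fetch each row to get lt,
# because the index itself only stores da.
index_on_da = sorted((da, pos) for pos, (da, _lt) in enumerate(table))
index_avg, index_jumps = aggregate(pos for _da, pos in index_on_da)

# The scan touches each row once; the index plan touches each index entry
# and then each row.
scan_reads = len(table)
index_reads = len(index_on_da) + len(table)

print(scan_avg == index_avg, scan_reads, index_reads, scan_jumps, index_jumps)
```

Both plans produce the same averages, but the index plan reads twice as many items and fetches rows out of storage order. With no WHERE clause to narrow the set, the sorted order saves nothing that offsets this extra work.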
