如何避免在MySQL复合索引上使用order by语句进行前缀查询的文件排序? [英] How to avoid a filesort on a mysql composite index for a prefix query with an order by statement?
问题描述
我有一个带有三个列(德语双峰)的1 GB mysql表:
I have a 1 GB mysql table with three colums (german bigrams):
create table sortedindex (source varchar(60),target varchar(60),score float)
engine=myisam character set utf8 collate utf8_bin;
我还创建了一个复合索引:
I also created a composite index:
create index sortedstd_ix on sortedindex (source(60), target(60), score);
另外,我压缩了表并使其变为只读,并对索引进行了排序:
additionally I compressed the table and made it read only and sorted the index using:
myisamchk --keys-used=0 -rq sortedindex
myisampack sortedindex
myisamchk -rq sortedindex --sort_buffer=3G --sort-index --sort-records=1
现在我问具有以下结构的查询:
now I ask queries with the following structure:
- 修复源
- 指定目标的前缀
- 按比分检索前k行
类似于以下内容:
select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;
mysql解释告诉我,仍在使用文件排序!
mysql explain tells me that is still uses a filesort!
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)`
我了解,如果我将查询更改为:
I understand that if I change the query to:
explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;
不会进行文件排序,但是会出错.
there will be no file sort but WRONG there is a filesort involved.
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)
来自此讨论,我意识到desc关键字是问题所在.因此我们不检查:
from this discussion i realize the desc keyword is the problem. so we check without:
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
行之有效的
但是我想对分数而不是目标进行降序排序.以这种方式创建索引
create index sortedstd_ix on sortedindex (source(60), score desc, target(60));
不是一个选项,因为目标过滤器将产生文件排序,否则,如果前缀很长并且source是一个常用词,则需要遍历的元素的结果列表可能会很长.
is not an option since the target filter will yield for a file sort then or if not the result list of elements that need to be traversed can be really long if the prefix is long and source is a common word.
我有某种感觉没有对此的明显解决方案?
I somehow have the feeling there is no obvious solution to this?
推荐答案
您是对的.没有明显的解决方案.需要排序是因为您要求目标的多个值(例如"interess%").因此,索引不会为您提供按分数排序的行.
You are right. There is no obvious solution to this. Sorting is needed because you are asking for multiple values of target (like "interess%"). Hence, the index will not give you rows that are sorted on score.
这篇关于如何避免在MySQL复合索引上使用order by语句进行前缀查询的文件排序?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!