MySQL:查询之间的最佳索引 [英] MySQL: Optimal index for between queries
问题描述
我有一个具有以下结构的表:
I have a table with the following structure:
CREATE TABLE `geo_ip` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`start_ip` int(10) unsigned NOT NULL,
`end_ip` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `start_ip` (`start_ip`),
KEY `end_ip` (`end_ip`),
KEY `start_end` (`start_ip`,`end_ip`),
KEY `end_start` (`end_ip`,`start_ip`)) ENGINE=InnoDB;
MySQL似乎无法在我的大多数查询中使用索引,因为 where
子句在之间使用,介于
: start_ip
和<$ c $之间c> end_ip
MySQL seems to be unable to use the indexes for most of my queries, as the where
clause uses a between
that falls somewhere between start_ip
and end_ip
:
select * from geo_ip where 2393196360 between start_ip and end_ip;
+----+-------------+--------+------+-------------------------------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+-------------------------------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | geo_ip | ALL | start_ip,end_ip,start_end,end_start | NULL | NULL | NULL | 2291578 | Using where |
+----+-------------+--------+------+-------------------------------------+------+---------+------+---------+-------------+
该表有几百万条记录。我尝试通过删除 start_ip
和 end_ip
列来扩展表,并为每个可能的值<创建一行 start_ip
和 end_ip
作为 id
,然后查询 ID
。虽然这大大提高了查询性能,但它导致表大小从不到1千兆字节增长到数十千兆字节(表中显然还有其他列)。
The table has a few million records. I tried expanding the table, by removing the start_ip
and end_ip
columns, and creating a row for every possible value of start_ip
and end_ip
as the id
, then querying the id
. While that vastly improved query performance, it resulted in the table size growing from less than a gigabyte to tens of gigabytes (the table has other columns obviously).
还有什么可以是为了提高查询性能?我可以以某种方式更改查询,还是可以不同地索引列以导致命中?或许我还没有想到的东西?
What else can be done to improve query performance? Can I change the query somehow, or can I index the columns differently to result in a hit? Or perhaps something I haven't thought of yet?
编辑:
奇怪的是,索引用于某些值。例如:
Strangely, the index is used for certain values. For example:
explain select * from geo_ip where 3673747503 between start_ip and end_ip;
+----+-------------+--------+-------+-------------------------------------+--------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+-------------------------------------+--------+---------+------+-------+-------------+
| 1 | SIMPLE | geo_ip | range | start_ip,end_ip,start_end,end_start | end_ip | 4 | NULL | 19134 | Using where |
+----+-------------+--------+-------+-------------------------------------+--------+---------+------+-------+-------------+
推荐答案
不确定原因,但是添加order by子句和限制查询似乎总是会导致索引命中,并在几毫秒内执行而不是几个秒。
Not sure why, but adding an order by clause and limit to the query seems to always result in an index hit, and executes in a few milliseconds instead of a few seconds.
explain select * from geo_ip where 2393196360 between start_ip and end_ip order by start_ip desc limit 1;
+----+-------------+--------+-------+-----------------+----------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+-----------------+----------+---------+------+--------+-------------+
| 1 | SIMPLE | geo_ip | range | start_ip,end_ip | start_ip | 4 | NULL | 975222 | Using where |
+----+-------------+--------+-------+-----------------+----------+---------+------+--------+-------------+
这对我来说已经足够了,不过我愿意很想知道优化器为什么决定不在另一种情况下使用索引的原因。
This is good enough for me now, although I would love to know the reasoning behind why the optimizer decides not to use the index in the other case.
这篇关于MySQL:查询之间的最佳索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!