MySQL does not always use index


Problem description


This is a very simple problem, yet it is hard to find a solution. The address table, with 2,498,739 rows, has min_ip and max_ip fields. These are the core anchors of the table for filtering.

The query is very simple.

SELECT * 
FROM address a 
WHERE min_ip < value
  AND max_ip > value;

So it is logical to create an index on min_ip and max_ip to make the query faster.

The following indexes were created.

CREATE INDEX ip_range ON address (min_ip, max_ip) USING BTREE;
CREATE INDEX min_ip ON address (min_ip ASC) USING BTREE;
CREATE INDEX max_ip ON address (max_ip DESC) USING BTREE;

I first tried creating only the first one (the combination of min_ip and max_ip), but it did not help, so I created all three indexes to give MySQL more options for index selection. (Note that this table is pretty much static and is more of a lookup table.)

+------------------------+---------------------+------+-----+---------------------+-----------------------------+
| Field                  | Type                | Null | Key | Default             | Extra                       |
+------------------------+---------------------+------+-----+---------------------+-----------------------------+
| id                     | bigint(20) unsigned | NO   | PRI | NULL                | auto_increment              |
| network                | varchar(20)         | YES  |     | NULL                |                             |
| min_ip                 | int(11) unsigned    | NO   | MUL | NULL                |                             |
| max_ip                 | int(11) unsigned    | NO   | MUL | NULL                |                             |
+------------------------+---------------------+------+-----+---------------------+-----------------------------+

Now, it should be straightforward to query the table with min_ip and max_ip as the filter criteria.

EXPLAIN
SELECT * 
FROM address a 
WHERE min_ip < 2410508496
  AND max_ip > 2410508496;

The query took around 0.120 to 0.200 seconds. However, under a load test, performance degrades rapidly. MySQL server CPU usage skyrockets to 100% with just a few simultaneous queries, and the query does not scale. The slow query log on the MySQL server was enabled with a threshold of 10 seconds, and the SELECT query showed up in the log after just a few seconds of load testing. So I checked the query with EXPLAIN and found out that it didn't use an index.

Explain plan result

    id  select_type  table   type    possible_keys           key     key_len  ref        rows  Extra        
------  -----------  ------  ------  ----------------------  ------  -------  ------  -------  -------------
     1  SIMPLE       a       ALL     ip_range,min_ip,max_ip  (NULL)  (NULL)   (NULL)  2417789  Using where  

Interestingly, it was able to identify ip_range, min_ip and max_ip as potential indexes but did not use any of them, as shown in the key column. I know I can use FORCE INDEX, so I ran EXPLAIN with it.

EXPLAIN
SELECT * 
FROM address a 
FORCE INDEX (ip_range)
WHERE min_ip < 2410508496
  AND max_ip > 2410508496;

Explain plan with FORCE INDEX result

    id  select_type  table   type    possible_keys  key       key_len  ref        rows  Extra                  
------  -----------  ------  ------  -------------  --------  -------  ------  -------  -----------------------
     1  SIMPLE       a       range   ip_range       ip_range  4        (NULL)  1208894  Using index condition  

With FORCE INDEX, yes, it uses the ip_range index as the key, and the rows column shows a subset of what the query without FORCE INDEX examines: 1,208,894 rows instead of 2,417,789. So using the index should definitely give better performance (unless I misunderstood the EXPLAIN result).
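
One way to see why the optimizer may still prefer the full scan is to count how many rows each half of the condition matches on its own. On a B-tree index over (min_ip, max_ip), a range condition on min_ip prevents the index from narrowing the range on max_ip, which is consistent with the key_len of 4 above (a single 4-byte int column). The following is a minimal sketch that uses only the table and columns from the question; the result column aliases are made up for illustration.

```sql
-- Count how selective each predicate is on its own.
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matching rows.
-- The aliases (total_rows, min_ip_matches, ...) are illustrative names only.
SELECT
    COUNT(*)                                         AS total_rows,
    SUM(min_ip < 2410508496)                         AS min_ip_matches,
    SUM(max_ip > 2410508496)                         AS max_ip_matches,
    SUM(min_ip < 2410508496 AND max_ip > 2410508496) AS both_match
FROM address;
```

If min_ip_matches turns out to be a large share of total_rows, the index range scan has to visit that many index entries and, for SELECT *, look up a table row for each entry that passes the max_ip check, which can cost more than reading the table sequentially.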

But what is more interesting is that, after a couple of tests, I found that in some cases MySQL does use the index even without FORCE INDEX. My observation is that when the value is small, it does use the index.

EXPLAIN
SELECT * 
FROM address a 
WHERE min_ip < 508496
  AND max_ip > 508496;

Explain Result

    id  select_type  table   type    possible_keys           key       key_len  ref       rows  Extra                  
------  -----------  ------  ------  ----------------------  --------  -------  ------  ------  -----------------------
     1  SIMPLE       a       range   ip_range,min_ip,max_ip  ip_range  4        (NULL)       1  Using index condition  

So it puzzles me that MySQL decides whether or not to use an index based on the value passed to the SELECT query. I can't imagine what the basis is for deciding to use the index for one value and not for another. I do understand that an index may not be used if no index matches the WHERE condition, but in this case it is very clear that the ip_range index, which is built on the min_ip and max_ip columns, is suitable for the WHERE condition.
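
The cost figures behind this choice can be inspected with the optimizer trace. This is a sketch that assumes MySQL 5.6 or later (the "Using index condition" output suggests it, but the version is not stated in the question) and a session that is allowed to change the optimizer_trace variable.

```sql
-- Enable tracing for the current session only.
SET optimizer_trace = 'enabled=on';

-- Trace the plan selection for the large value; EXPLAIN is traced as well,
-- so the rows do not have to be fetched.
EXPLAIN
SELECT *
FROM address a
WHERE min_ip < 2410508496
  AND max_ip > 2410508496;

-- Read the trace of the statement above (a JSON document).
SELECT TRACE
FROM information_schema.OPTIMIZER_TRACE;

-- Turn tracing off again.
SET optimizer_trace = 'enabled=off';
```

In the JSON, the range_analysis section lists the estimated rows and cost of a table scan next to those of each candidate index range. Running the same steps with the small value (508496) should show a much lower row estimate for ip_range, which is what makes the index the cheaper choice there.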

But the bigger problem I have is: what about other queries? Do I have to test all of them at scale? And even then, as the data grows, can I rely on MySQL to use the index? Yes, I can always use FORCE INDEX to ensure it uses the index. But this is not standard SQL that works on all databases. ORM frameworks may not support the FORCE INDEX syntax when they generate SQL, and it tightly couples your queries to your index names.

I am not sure whether anyone else has encountered this issue, but it seems like a very big problem to me.

Solution

I fully agree with Vatev and the others. MySQL is not the only database that does this. Scanning the table is sometimes cheaper than reading the index first and then looking up the corresponding rows on disk.

The only time it is sure to use the index is when it is a covering index, meaning that every column the query needs from this particular table is present in the index. For example, if you only need the network column

SELECT network
FROM address a 
WHERE min_ip < 2410508496
  AND max_ip > 2410508496;

then a covering index like

CREATE INDEX ip_range ON address (min_ip, max_ip, network) USING BTREE;

would let MySQL read only the index, as there is no need to look up additional data on disk at all. And the whole index could be kept in memory.
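
Whether the covering index is actually picked up can be checked with EXPLAIN. The sketch below creates the index under a hypothetical name, ip_range_network, so that it does not clash with the ip_range index that already exists on the table in the question.

```sql
-- Hypothetical index name, chosen to avoid a clash with the existing ip_range index.
CREATE INDEX ip_range_network ON address (min_ip, max_ip, network) USING BTREE;

-- Only columns contained in the index are referenced,
-- so the query can be answered from the index alone.
EXPLAIN
SELECT network
FROM address a
WHERE min_ip < 2410508496
  AND max_ip > 2410508496;
```

If the covering index is used, the key column should name it and the Extra column should include "Using index", which is how EXPLAIN reports that no table rows had to be read.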
