MySQL not using indexes with WHERE IN clause?


Question

I'm trying to optimize some of the database queries in my Rails app and I have several that have got me stumped. They are all using an IN in the WHERE clause and are all doing full table scans even though an appropriate index appears to be in place.

For example:

SELECT `user_metrics`.* FROM `user_metrics` WHERE (`user_metrics`.user_id IN (N,N,N,N,N,N,N,N,N,N,N,N))

performs a full table scan and EXPLAIN says:

select_type: simple
type: all
extra: using where
possible_keys: index_user_metrics_on_user_id  (which is an index on the user_id column)
key: (none)
key_length: (none)
ref: (none)
rows: 208

Are indexes not used when an IN statement is used or do I need to do something differently? The queries here are being generated by Rails so I could revisit how my relationships are defined, but I thought I'd start with potential fixes at the DB level first.

Recommended answer

See How MySQL Uses Indexes.

Also validate whether MySQL still performs a full table scan after you add an additional 2000-or-so rows to your user_metrics table. In small tables, access-by-index is actually more expensive (I/O-wise) than a table scan, and MySQL's optimizer might take this into account.
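One way to check this is to re-run EXPLAIN on the same query once the table has grown and compare the type and key columns with the output in the question. The sketch below assumes the table and index names from the question; the literal IDs are placeholders, since the original query only shows N.

```sql
-- Placeholder IDs: substitute real user_id values from your data.
EXPLAIN
SELECT `user_metrics`.*
FROM `user_metrics`
WHERE `user_metrics`.user_id IN (1, 2, 3, 4, 5);
```

If the optimizer has switched to the index, key should show index_user_metrics_on_user_id, and type will typically read range rather than ALL.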

Contrary to my previous post, it turns out that MySQL is also using a cost-based optimizer, which is very good news - that is, provided you run your ANALYZE at least once when you believe that the volume of data in your database is representative of future day-to-day usage.

When dealing with cost-based optimizers (Oracle, Postgres, etc.), you need to make sure to periodically run ANALYZE on your various tables whenever their size grows by more than 10-15%. (Postgres will do this automatically for you, by default, whereas other RDBMSs will leave this responsibility to a DBA, i.e. you.) Through statistical analysis, ANALYZE helps the optimizer get a better idea of how much I/O (and other associated resources, such as the CPU needed for sorting) will be involved when choosing between candidate execution plans. Failure to run ANALYZE may result in very poor, sometimes disastrous planning decisions (e.g. millisecond queries sometimes taking hours because of bad nested loops on JOINs).
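In MySQL the corresponding statement is ANALYZE TABLE. A minimal sketch for the table from the question follows; SHOW INDEX is included only as one way to look at the cardinality estimates the optimizer works from, and is not part of the original answer.

```sql
-- Refresh the key distribution statistics for the table.
ANALYZE TABLE `user_metrics`;

-- Inspect the Cardinality column reported for each index on the table.
SHOW INDEX FROM `user_metrics`;
```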

If performance is still unsatisfactory after running ANALYZE, then you will typically be able to work around the issue by using hints, e.g. FORCE INDEX, whereas in other cases you might have stumbled over a MySQL bug (e.g. this older one, which could have bitten you were you to use Rails' nested_set).
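As a sketch of such a hint, FORCE INDEX is written directly after the table name. The index name is the one from the question's EXPLAIN output; the IDs are again placeholders.

```sql
-- FORCE INDEX tells the optimizer to treat a table scan as very expensive
-- whenever the named index can be used to find the rows.
EXPLAIN
SELECT `user_metrics`.*
FROM `user_metrics` FORCE INDEX (`index_user_metrics_on_user_id`)
WHERE `user_metrics`.user_id IN (1, 2, 3, 4, 5);
```

Running it under EXPLAIN first shows whether the hint changes the plan before you commit to it in application code.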

Now, since you are in a Rails app, it will be cumbersome (and defeat the purpose of ActiveRecord) to issue your custom queries with hints instead of continuing to use the ActiveRecord-generated ones.

I had mentioned that in our Rails application all SELECT queries dropped below 100ms after switching to Postgres, whereas some of the complex joins generated by ActiveRecord would occasionally take 15s or more with MySQL 5.1 because of nested loops with inner table scans, even when indices were available. No optimizer is perfect, and you should be aware of the options. Besides query plan optimization, another potential performance issue to be aware of is locking, though that is outside the scope of your problem.
