How do I optimize MySQL's queries with constants?
NOTE: the original question is moot but scan to the bottom for something relevant.
I have a query I want to optimize that looks something like this:
select cols from tbl where col = "some run time value" limit 1;
I want to know which keys are being used, but whatever I pass to EXPLAIN, it optimizes the WHERE clause away to nothing ("Impossible WHERE noticed...") because I fed it a constant.
- Is there a way to tell MySQL not to do constant optimizations in EXPLAIN?
- Am I missing something?
- Is there a better way to get the info I need?
Edit: EXPLAIN seems to be giving me the query plan that results from constant values. As the query is part of a stored procedure (and IIRC query plans in sprocs are generated before they are called), this does me no good because the values are not constant. What I want is to find out what query plan the optimizer will generate when it doesn't know what the actual value will be.
Am I missing something?
Edit2: Asking around elsewhere, it seems that MySQL always regenerates query plans unless you go out of your way to make it re-use them. Even in stored procedures. From this it would seem that my question is moot.
However, that doesn't make what I really wanted to know moot: how do you optimize a query that contains values that are constant within any specific query, but where I, the programmer, don't know in advance what value will be used? For example, say my client-side code is generating a query with a number in its WHERE clause. Sometimes the number will result in an impossible WHERE clause; other times it won't. How can I use EXPLAIN to examine how well optimized the query is?
The best approach I'm seeing right off the bat would be to run EXPLAIN on it for the full matrix of exist/non-exist cases. That really isn't a very good solution, as it would be both hard and error-prone to do by hand.
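The exist/non-exist cases can at least be written down once as a script instead of being typed by hand. The following is a minimal sketch, assuming a hypothetical table tbl whose column col is the primary key (the question does not show the real schema). The exact wording in the Extra column differs between MySQL versions, so treat the comments as what to look for rather than guaranteed output.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TABLE IF NOT EXISTS tbl (
    col     INT NOT NULL PRIMARY KEY,
    payload VARCHAR(32)
);
INSERT IGNORE INTO tbl (col, payload) VALUES (1, 'a'), (2, 'b');

-- Case 1: the value exists. A lookup on the primary key is usually
-- reported with type = const.
EXPLAIN SELECT payload FROM tbl WHERE col = 1 LIMIT 1;

-- Case 2: the value does not exist. The optimizer probes the key while
-- planning and reports an impossible WHERE (or a similarly worded note)
-- instead of an access path.
EXPLAIN SELECT payload FROM tbl WHERE col = 999 LIMIT 1;
```

Comparing the two outputs side by side shows which index is considered when a row can match, which is the information the "Impossible WHERE" case hides.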
For example, say my client-side code is generating a query with a number in its WHERE clause.
Sometimes the number will result in an impossible WHERE clause; other times it won't.
How can I use EXPLAIN to examine how well optimized the query is?
MySQL builds different query plans for different values of bound parameters.
In this article you can read the list of which step the MySQL optimizer performs at which stage:
Action                                         When
Query parse                                    PREPARE
Negation elimination                           PREPARE
Subquery re-writes                             PREPARE
Nested JOIN simplification                     First EXECUTE
OUTER->INNER JOIN conversions                  First EXECUTE
Partition pruning                              Every EXECUTE
COUNT/MIN/MAX elimination                      Every EXECUTE
Constant subexpression removal                 Every EXECUTE
Equality propagation                           Every EXECUTE
Constant table detection                       Every EXECUTE
ref access analysis                            Every EXECUTE
range/index_merge analysis and optimization    Every EXECUTE
Join optimization                              Every EXECUTE
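To make the PREPARE / EXECUTE split in the list concrete, here is a small sketch using a server-side prepared statement. The table tbl_prepared, its columns, and the user variable @v are hypothetical names chosen for illustration; they do not come from the question.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE IF NOT EXISTS tbl_prepared (
    col     INT NOT NULL PRIMARY KEY,
    payload VARCHAR(32)
);
INSERT IGNORE INTO tbl_prepared (col, payload) VALUES (1, 'a'), (2, 'b');

-- Parsing happens once, at PREPARE time.
PREPARE stmt FROM 'SELECT payload FROM tbl_prepared WHERE col = ? LIMIT 1';

-- Per the list, the steps that depend on the constant (constant table
-- detection, ref and range analysis, join optimization) are repeated
-- on every EXECUTE, so each value gets its own plan.
SET @v = 1;
EXECUTE stmt USING @v;    -- value exists: one row returned

SET @v = 999;
EXECUTE stmt USING @v;    -- value does not exist: empty result

DEALLOCATE PREPARE stmt;
```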
One more thing is missing from this list.
MySQL can rebuild a query plan on every JOIN iteration: the so-called range checking for each record.
If you have a composite index on a table:
CREATE INDEX ix_table2_col1_col2 ON table2 (col1, col2)
and a query like this:
SELECT *
FROM table1 t1
JOIN table2 t2
ON t2.col1 = t1.value1
AND t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound
then MySQL will NOT use an index RANGE access from (t1.value1, t1.value2_lowerbound) to (t1.value1, t1.value2_upperbound). Instead, it will use an index REF access on (t1.value1) and just filter out the wrong values.
But if you rewrite the query like this:
SELECT *
FROM table1 t1
JOIN table2 t2
ON t2.col1 <= t1.value1
AND t2.col1 >= t1.value1
AND t2.col2 BETWEEN t1.value2_lowerbound AND t2.value2_upperbound
then MySQL will recheck index RANGE access for each record from table1 and decide on the fly whether to use RANGE access.
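One way to see the difference is to run EXPLAIN on both forms of the join. The sketch below assumes hypothetical INT columns (the answer does not give the column types) and that both bounds are meant to come from table1. Load representative data before running it, because plans on empty tables say little; in the second plan, look for a "Range checked for each record" note in the Extra column.

```sql
-- Hypothetical column types; the index matches the one defined above.
CREATE TABLE IF NOT EXISTS table1 (
    value1            INT NOT NULL,
    value2_lowerbound INT NOT NULL,
    value2_upperbound INT NOT NULL
);
CREATE TABLE IF NOT EXISTS table2 (
    col1 INT NOT NULL,
    col2 INT NOT NULL,
    KEY ix_table2_col1_col2 (col1, col2)
);

-- Equality form: expect ref access on col1, with col2 filtered afterwards.
EXPLAIN
SELECT *
FROM table1 t1
JOIN table2 t2
  ON t2.col1 = t1.value1
 AND t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound;

-- Range form: the equality is written as a pair of inequalities, so the
-- optimizer may re-evaluate range access for every row of table1.
EXPLAIN
SELECT *
FROM table1 t1
JOIN table2 t2
  ON t2.col1 <= t1.value1
 AND t2.col1 >= t1.value1
 AND t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound;
```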
You can read about it in these articles in my blog:
- Selecting timestamps for a time zone - how to use coarse filtering to filter out timestamps without a timezone
- Emulating SKIP SCAN - how to emulate the SKIP SCAN access method in MySQL
- Analytic functions: optimizing LAG, LEAD, FIRST_VALUE, LAST_VALUE - how to emulate Oracle's analytic functions in MySQL
- Advanced row sampling - how to select N records from each group in MySQL
All of these employ RANGE CHECKING FOR EACH RECORD.
Returning to your question: there is no way to tell which plan MySQL will use for every given constant, since there is no plan before the constant is given.
Unfortunately, there is no way to force MySQL to use one query plan for every value of a bound parameter.
You can control the JOIN order and the indexes being chosen by using the STRAIGHT_JOIN and FORCE INDEX clauses, but they will not force a certain access path on an index or forbid the IMPOSSIBLE WHERE.
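For reference, this is roughly what the two hints look like when applied to the example join. It is a sketch only: the column types are hypothetical, and the hints fix the join order and the candidate index, not the access method the optimizer picks on that index.

```sql
-- Hypothetical column types, as in the earlier example tables.
CREATE TABLE IF NOT EXISTS table1 (
    value1            INT NOT NULL,
    value2_lowerbound INT NOT NULL,
    value2_upperbound INT NOT NULL
);
CREATE TABLE IF NOT EXISTS table2 (
    col1 INT NOT NULL,
    col2 INT NOT NULL,
    KEY ix_table2_col1_col2 (col1, col2)
);

-- STRAIGHT_JOIN makes MySQL read the tables in the order they are listed
-- (table1 first, then table2). FORCE INDEX restricts table2 to the named
-- index whenever an index can be used at all.
EXPLAIN
SELECT STRAIGHT_JOIN *
FROM table1 t1
JOIN table2 t2 FORCE INDEX (ix_table2_col1_col2)
  ON t2.col1 = t1.value1
 AND t2.col2 BETWEEN t1.value2_lowerbound AND t1.value2_upperbound;
```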
On the other hand, MySQL employs only NESTED LOOPS for all JOINs. That means that if you build the right JOIN order or choose the right indexes, MySQL will probably benefit from all IMPOSSIBLE WHEREs.