SQL INDEX not used on WHERE ABS(x-y) < k condition, but used on y - k < x < y + k condition
Problem description
I have a query involving pairs of rows whose timestamps differ by less than 2 hours (~0.08333 days):
SELECT mt1.*, mt2.* FROM mytable mt1, mytable mt2
WHERE ABS(JULIANDAY(mt1.date) - JULIANDAY(mt2.date)) < 0.08333
This query is rather slow, i.e. ~1 second (the table has ~10k rows).
An idea was to use an INDEX. Obviously, CREATE INDEX id1 ON mytable(date) didn't improve anything, which is expected.
Then I noticed that the magical query CREATE INDEX id2 ON mytable(JULIANDAY(date)):
1. didn't help when using:
... WHERE ABS(JULIANDAY(mt1.date) - JULIANDAY(mt2.date)) < 0.08333
2. didn't help when using:
... WHERE JULIANDAY(mt2.date) - 0.08333 < JULIANDAY(mt1.date) < JULIANDAY(mt2.date) + 0.08333
3. ... but massively improved the performance (query time divided by 50!) when using:
... WHERE JULIANDAY(mt1.date) < JULIANDAY(mt2.date) + 0.08333
AND JULIANDAY(mt1.date) > JULIANDAY(mt2.date) - 0.08333
Of course 1., 2. and 3. are equivalent since, mathematically,
|x-y| < 0.08333 <=> y - 0.08333 < x < y + 0.08333
<=> x < y + 0.08333 AND x > y - 0.08333
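One caveat on form 2: the equivalence holds mathematically, but SQLite parses a chained comparison a < b < c left to right as (a < b) < c, so the inner result (0 or 1) is what gets compared with c. The sketch below uses made-up values for x, y and k to show the chained form and the AND form disagreeing; with Julian day numbers, the right-hand bound is far greater than 1, so the chained form would be true for every pair of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Illustrative values only: |x - y| = 95, far larger than k = 1.
x, y, k = 100.0, 5.0, 1.0

# Chained form: evaluated as ((y - k) < x) < (y + k), i.e. 1 < 6.
chained = conn.execute(
    "SELECT ? - ? < ? < ? + ?", (y, k, x, y, k)
).fetchone()[0]

# AND form: x < y + k AND x > y - k, i.e. 100 < 6 AND 100 > 4.
split = conn.execute(
    "SELECT ? < ? + ? AND ? > ? - ?", (x, y, k, x, y, k)
).fetchone()[0]

print("chained:", chained)
print("split:  ", split)
```

So form 2 is not just slow: it is a different predicate from forms 1 and 3.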
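Question: why do solutions 1. and 2. not use the INDEX, whereas solution 3. does?
Notes:
- I'm using Python + SQLite (the sqlite3 module).
- Running EXPLAIN QUERY PLAN SELECT ... on a query that does not use the index gives: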
(0, 0, 0, u'SCAN TABLE mytable AS mt1')
(0, 1, 1, u'SCAN TABLE mytable AS mt2')
- Running EXPLAIN QUERY PLAN SELECT ... on query 3., which uses the index, gives:
(0, 0, 1, u'SCAN TABLE mytable AS mt2')
(0, 1, 0, u'SEARCH TABLE mytable AS mt1 USING INDEX id2 (<expr>>? AND <expr><?)')
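The two plans above can be reproduced with a small in-memory table. This is a minimal sketch, not the asker's actual setup: the table layout (id, date) and the sample rows are assumptions, and it needs a SQLite build that allows JULIANDAY() in an index on an expression. The exact wording of the plan text also varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data: one timestamp per hour of a single day.
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO mytable (date) VALUES (?)",
    [("2018-01-01 %02d:00:00" % h,) for h in range(24)],
)
conn.execute("CREATE INDEX id2 ON mytable(JULIANDAY(date))")


def plan(where):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for the self-join."""
    sql = (
        "EXPLAIN QUERY PLAN SELECT mt1.*, mt2.* "
        "FROM mytable mt1, mytable mt2 WHERE " + where
    )
    return [row[3] for row in conn.execute(sql)]


# Form 1: the indexed expression is hidden inside ABS(...).
plan_abs = plan("ABS(JULIANDAY(mt1.date) - JULIANDAY(mt2.date)) < 0.08333")

# Form 3: two AND-separated terms, each comparing the indexed expression directly.
plan_and = plan(
    "JULIANDAY(mt1.date) < JULIANDAY(mt2.date) + 0.08333 "
    "AND JULIANDAY(mt1.date) > JULIANDAY(mt2.date) - 0.08333"
)

print(plan_abs)
print(plan_and)
```

Only the second plan should contain a SEARCH step on id2; the first should consist of plain scans.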
Recommended answer
I believe that the inclusion of AND is the reason, as per the SQLite query optimizer documentation:
The WHERE clause on a query is broken up into "terms" where each term is separated from the others by an AND operator. If the WHERE clause is composed of constraints separated by the OR operator then the entire clause is considered to be a single "term" to which the OR-clause optimization is applied.
It may be worthwhile running ANALYZE to see if that improves matters.
As per the comments:
I think the previously added paragraph can clarify why ABS(x-y) < k is not using the index, and why x < y + k is using it, don't you think so? Would you want to include this paragraph? [All terms of the WHERE clause are analyzed to see if they can be satisfied using indices. To be usable by an index a term must be of one of the following forms: column = expression, column IS expression, column > expression ...]
The following has been added.
To be usable by an index a term must be of one of the following forms:
column = expression
column IS expression
column > expression
column >= expression
column < expression
column <= expression
expression = column
expression > column
expression >= column
expression < column
expression <= column
column IN (expression-list)
column IN (subquery)
column IS NULL
I'm not sure if it would work with a BETWEEN (e.g. WHERE column BETWEEN expr1 AND expr2).
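The BETWEEN question can be checked empirically. The SQLite optimizer documentation describes a BETWEEN term as being expanded into two virtual terms (expr1 >= expr2 AND expr1 <= expr3), so it would be expected to use the index just as form 3 does. The sketch below is a hedged check rather than a guarantee for every SQLite version; the table layout and rows are made up for illustration. Note that BETWEEN is inclusive, whereas the original condition uses strict inequalities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data, mirroring the question's table name and index.
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO mytable (date) VALUES (?)",
    [("2018-01-01 %02d:00:00" % h,) for h in range(24)],
)
conn.execute("CREATE INDEX id2 ON mytable(JULIANDAY(date))")

sql = (
    "EXPLAIN QUERY PLAN SELECT mt1.*, mt2.* "
    "FROM mytable mt1, mytable mt2 "
    "WHERE JULIANDAY(mt1.date) BETWEEN JULIANDAY(mt2.date) - 0.08333 "
    "AND JULIANDAY(mt2.date) + 0.08333"
)

# Column 3 of each EXPLAIN QUERY PLAN row holds the human-readable detail.
plan_between = [row[3] for row in conn.execute(sql)]
for detail in plan_between:
    print(detail)
```

If the plan shows a SEARCH step using id2, the BETWEEN form benefits from the index on this SQLite build.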