WHERE clause on join adding 4 seconds to execution time
Problem description
SELECT journey.id
FROM journey
JOIN journey_day ON journey_day.journey = journey.id
JOIN service ON journey.service = service.id
JOIN operator ON operator.id = service.operator
JOIN pattern ON pattern.id = journey.pattern
JOIN pattern_link pl ON pl.section = pattern.section AND pl.from_stop = "370023292"
JOIN pattern_link pl2 ON pl2.section = pl.section AND pl2.from_sequence < pl.from_sequence
WHERE CURDATE() BETWEEN service.date_start and service.date_end AND operator.id = "TMTL"
Above is an SQL query that takes 0.1 to 0.3 seconds to run on average.
For some reason, as soon as I append AND journey_day.day = 3 to the WHERE clause, it adds an extra 4 seconds to the execution time. That is why I am posting this question.
Every column in the journey_day table is indexed in some way.
id INT(11) | journey VARCHAR(128) | day TINYINT(1)
id is the primary key, and the journey and day columns are both indexed. Joining the table causes no problems at all, but as soon as the WHERE clause filters on it, execution time goes up far more than I would like.
The journey_day table only exists to record which days a journey runs on. For example, if a journey runs on Wednesdays and Thursdays, there is one row with the journey's identifier in the journey column and 3 in the day column, and another row with the same identifier and 4.
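As a rough reconstruction of the table described above, the following sketch may help. Only the column names and types come from the question; the index names, the NOT NULL and AUTO_INCREMENT attributes, and the journey identifier 'J100' are assumptions made for illustration.

```sql
-- Hypothetical reconstruction of journey_day as described in the question.
-- Index names, constraints and the sample identifier 'J100' are assumptions.
CREATE TABLE journey_day (
    id      INT(11)      NOT NULL AUTO_INCREMENT,
    journey VARCHAR(128) NOT NULL,
    day     TINYINT(1)   NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_journey (journey),  -- single-column index on journey
    INDEX idx_day (day)           -- single-column index on day
);

-- A journey that runs on Wednesday (3) and Thursday (4) gets one row per day.
INSERT INTO journey_day (journey, day) VALUES
    ('J100', 3),
    ('J100', 4);
```

If day only ever holds day-of-week numbers, it has very few distinct values, so an index on day alone narrows the search much less than one that also includes journey.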
Any ideas why this huge delay in execution time is happening?
EDIT
Result of an EXPLAIN prepended to the query: http://i.imgur.com/QrjZEHy.png
Since you are filtering on both the journey_day.journey and journey_day.day columns, a multi-column index would be more efficient than multiple single-column indexes.
From "How MySQL Uses Indexes" in the MySQL Reference Manual:
Suppose that you issue the following SELECT statement:
mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
If a multiple-column index exists on col1 and col2, the appropriate rows can be fetched directly. If separate single-column indexes exist on col1 and col2, the optimizer will attempt to use the Index Merge optimization (see Section 8.3.1.4, "Index Merge Optimization"), or attempt to find the most restrictive index by deciding which index finds fewer rows and using that index to fetch the rows.
Thus, you could consider creating an index, as follows:
CREATE INDEX id_journey_day_id_day ON journey_day (journey, day);
In that case, the following query could use the index:
SELECT journey.id
FROM journey
JOIN journey_day ON journey_day.journey = journey.id AND journey_day.day = 3
JOIN service ON journey.service = service.id
JOIN operator ON operator.id = service.operator
JOIN pattern ON pattern.id = journey.pattern
JOIN pattern_link pl ON pl.section = pattern.section AND pl.from_stop = "370023292"
JOIN pattern_link pl2 ON pl2.section = pl.section AND pl2.from_sequence < pl.from_sequence
WHERE CURDATE() BETWEEN service.date_start and service.date_end AND operator.id = "TMTL";
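To check whether the optimizer actually picks up the new index, something along these lines could be run after creating it. This is only a sketch: the shortened query keeps just the journey and journey_day tables for brevity, and the exact EXPLAIN output depends on the MySQL version and the data.

```sql
-- List the indexes on journey_day; id_journey_day_id_day should appear
-- with journey as Seq_in_index 1 and day as Seq_in_index 2.
SHOW INDEX FROM journey_day;

-- Prepend EXPLAIN to see the plan; if the composite index is chosen, the
-- journey_day row should list id_journey_day_id_day in the "key" column.
EXPLAIN
SELECT journey.id
FROM journey
JOIN journey_day ON journey_day.journey = journey.id AND journey_day.day = 3;
```

Because journey is the leftmost column of the composite index, that index can also serve lookups on journey alone, so a separate single-column index on journey may turn out to be redundant.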