WHERE clause on join adding 4 seconds to execution time

Problem description

SELECT journey.id
FROM journey
JOIN journey_day ON journey_day.journey = journey.id
JOIN service ON journey.service = service.id
JOIN operator ON operator.id = service.operator
JOIN pattern ON pattern.id = journey.pattern
JOIN pattern_link pl ON pl.section = pattern.section AND pl.from_stop = "370023292"
JOIN pattern_link pl2 ON pl2.section = pl.section AND pl2.from_sequence < pl.from_sequence
WHERE CURDATE() BETWEEN service.date_start and service.date_end AND operator.id = "TMTL"

The SQL query above takes 0.1 to 0.3 seconds to run on average.

For some reason, as soon as I append AND journey_day.day = 3 to the WHERE clause, execution time increases by 4 seconds. That is why I am posting this question.

Every column in the journey_day table has indexing of some sort.

id(INT 11) | journey(VARCHAR 128) | day(TINYINT 1)

The id is the primary key, and the journey and day columns are each indexed separately. Joining the table causes no problems, but as soon as the WHERE clause filters on it, execution time goes up by far too much for my liking.

The journey_day table is just there to determine which days a journey runs on. For example, if a journey runs on Wednesdays and Thursdays, there is one row with the journey's identifier in the journey column and 3 in the day column, and a second row with the same identifier and 4.
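
As a rough illustration of the layout described above, the table and the two rows for such a journey might look like the sketch below. The column types come from the description; the NOT NULL and AUTO_INCREMENT attributes, the index names idx_journey and idx_day, and the identifier 'J1' are assumptions made only for this example.

```sql
-- Sketch of journey_day as described in the question.
-- Assumed for illustration: NOT NULL, AUTO_INCREMENT, the index names, and 'J1'.
CREATE TABLE journey_day (
    id      INT(11)      NOT NULL AUTO_INCREMENT,
    journey VARCHAR(128) NOT NULL,
    day     TINYINT(1)   NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_journey (journey),  -- single-column index on journey
    INDEX idx_day (day)           -- single-column index on day
);

-- A journey that runs on Wednesdays (3) and Thursdays (4) is stored as two rows.
INSERT INTO journey_day (journey, day) VALUES
    ('J1', 3),
    ('J1', 4);
```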

Any ideas why this huge delay in execution time is happening?

EDIT

Result of an EXPLAIN prepended to the query: http://i.imgur.com/QrjZEHy.png

Solution

Since you are using both the journey_day.journey and journey_day.day columns, a multi-column index would be more efficient than multiple single-column indexes.

From "How MySQL Uses Indexes" in the MySQL Reference Manual:

Suppose that you issue the following SELECT statement:

mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

If a multiple-column index exists on col1 and col2, the appropriate rows can be fetched directly. If separate single-column indexes exist on col1 and col2, the optimizer will attempt to use the Index Merge optimization (see Section 8.3.1.4, "Index Merge Optimization"), or attempt to find the most restrictive index by deciding which index finds fewer rows and using that index to fetch the rows.

Thus, you could consider creating an index, as follows:

CREATE INDEX id_journey_day_id_day ON journey_day (journey, day);
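
To confirm that the index was created with the columns in the intended order, one option is to list the table's indexes. This is only a quick check and assumes the CREATE INDEX statement above has already been run.

```sql
-- List all indexes on journey_day. For the composite index, expect two rows
-- named id_journey_day_id_day: Seq_in_index 1 for journey and 2 for day.
SHOW INDEX FROM journey_day;
```

Because journey is the leftmost column, this composite index can also serve lookups on journey alone, so a separate single-column index on journey may become redundant.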

In that case, the following query should be able to use the index:

SELECT journey.id
FROM journey
JOIN journey_day ON journey_day.journey = journey.id AND journey_day.day = 3
JOIN service ON journey.service = service.id
JOIN operator ON operator.id = service.operator
JOIN pattern ON pattern.id = journey.pattern
JOIN pattern_link pl ON pl.section = pattern.section AND pl.from_stop = "370023292"
JOIN pattern_link pl2 ON pl2.section = pl.section AND pl2.from_sequence < pl.from_sequence
WHERE CURDATE() BETWEEN service.date_start and service.date_end AND operator.id = "TMTL";
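
Whether the optimizer actually picks the new index depends on the data and the MySQL version, so it is worth checking rather than assuming. A minimal way to do that is to prepend EXPLAIN to the query above and look at the row for journey_day:

```sql
-- If the composite index is chosen, the "key" column of the journey_day row
-- should show id_journey_day_id_day. The optimizer may still prefer another plan.
EXPLAIN
SELECT journey.id
FROM journey
JOIN journey_day ON journey_day.journey = journey.id AND journey_day.day = 3
JOIN service ON journey.service = service.id
JOIN operator ON operator.id = service.operator
JOIN pattern ON pattern.id = journey.pattern
JOIN pattern_link pl ON pl.section = pattern.section AND pl.from_stop = "370023292"
JOIN pattern_link pl2 ON pl2.section = pl.section AND pl2.from_sequence < pl.from_sequence
WHERE CURDATE() BETWEEN service.date_start and service.date_end AND operator.id = "TMTL";
```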
