为什么即使有多列索引,我的多列查询也比相应的单列查询慢得多? [英] Why is my multi-column query dramatically slower than the corresponding single-column queries, even with a multi-column index?

查看:127
本文介绍了为什么即使有多列索引,我的多列查询也比相应的单列查询慢得多?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下查询:

SELECT * 
from stop_times 
WHERE (departure_time BETWEEN '02:41' AND '05:41' 
       OR departure_time BETWEEN '26:41' AND '29:41') 
    AND stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)

在〜800ms内返回134行.如果我将其拆分:

that returns 134 rows in ~800ms. If I split it:

SELECT * 
from stop_times 
WHERE (departure_time BETWEEN '02:41' AND '05:41' 
       OR departure_time BETWEEN '26:41' AND '29:41')

在约10ms内返回约11万行,并且

returns ~110k rows in ~10ms and

SELECT * 
from stop_times 
WHERE stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)

在〜100ms内返回〜5k行.

returns ~5k rows in ~100ms.

我尝试同时使用多列索引(departure_time和stop_id)以及2个单独的索引,但是无论哪种情况,第一个查询似乎都不会少于800毫秒.我的stop_times表大约有350万行.有什么我可能会丢失的东西,可以大大加快第一个查询的速度吗?

I tried using both a multi-column index (departure_time and stop_id) as well as 2 separate indexes, but in either case the first query can't seem to take less than ~800ms. My stop_times table has about 3.5M rows. Is there anything I could be missing and that would significantly speed up that first query?

更新1:创建表格:

CREATE TABLE `stop_times` (
  `trip_id` varchar(20) DEFAULT NULL,
  `departure_time` time DEFAULT NULL,
  `stop_id` varchar(20) DEFAULT NULL,
  KEY `index_stop_times_on_trip_id` (`trip_id`),
  KEY `index_stop_times_on_departure_time_and_stop_id` (`departure_time`,`stop_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

不幸的是,stop_id和trip_id是varchars而不是整数,这超出了我的控制范围...

stop_id and trip_id being varchars instead of integers is beyond my control unfortunately...

更新2: EXPLAIN用于departure_time, stop_id多列索引:

select_type: SIMPLE
type: range
rows: 239084

EXPLAIN用于stop_id, departure_time多列索引:

select_type: SIMPLE
type: range
rows: 141

更新3: EXPLAIN表示IN(51511,51509,51508,51510,6,53851,51522,51533)

select_type: SIMPLE
type: ALL
rows: 3556973 (lol)

EXPLAIN表示IN("51511","51509","51508","51510","6","53851","51522","51533")

select_type: SIMPLE
type: range
rows: 141

推荐答案

您是否创建了索引stop_id, departure_time?因为departure_time, stop_id绝对不会做任何事情.

Did you create an index stop_id, departure_time? Because departure_time, stop_id will do absolutely nothing.

这是一个真正的艰巨的任务-处理索引:(

This is a really hard one - it has every possible bad thing for dealing with indexes :(

您有一个范围,一个OR或一个非连续的IN-不会比这更糟.

You have a range, an OR and a non contiguous IN - it doesn't get worse than that.

尝试使用stop_id, departure_time,如果这样做没有帮助,那么除了切换到PostgreSQL以外,您无能为力.

Try stop_id, departure_time and if it doesn't help then there is nothing much you can do short of switching to PostgreSQL.

您还可以尝试将查询重写为:

You can also try rewriting the query as:

SELECT * 
from stop_times 
WHERE ( stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)
      AND departure_time BETWEEN '02:41' AND '05:41'
      )
   OR ( stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)
      AND departure_time BETWEEN '26:41' AND '29:41' 
      ) 

或:

    SELECT * 
    from stop_times 
    WHERE ( stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)
          AND departure_time BETWEEN '02:41' AND '05:41'
          )
UNION ALL
    SELECT * 
    from stop_times 
    WHERE ( stop_times.stop_id IN(51511,51509,51508,51510,6,53851,51522,51533)
          AND departure_time BETWEEN '26:41' AND '29:41' 
          )

这篇关于为什么即使有多列索引,我的多列查询也比相应的单列查询慢得多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆