Postgres uses wrong index in query plan

Problem description

Below I have two almost identical queries; only the LIMIT differs. Nevertheless, the query plans and execution times are totally different. The first query is more than 300 times slower than the second one.

The problem only occurs for a small number of owner_ids: owners with many routes (more than 1,000), none of which has been edited recently. The table route contains 2,806,976 rows. The owner in the example has 4,510 routes.

The database is hosted on Amazon RDS, on an instance with 34.2 GiB of memory, 4 vCPUs, and provisioned IOPS (instance type db.m2.2xlarge).

EXPLAIN ANALYZE SELECT
    id
FROM
    route
WHERE
    owner_id = 39127
ORDER BY
    edited_date DESC
LIMIT
    5

Query plan:
"Limit  (cost=0.43..5648.85 rows=5 width=12) (actual time=1.046..12949.436 rows=5 loops=1)"
"  ->  Index Scan Backward using route_i_edited_date on route  (cost=0.43..5368257.28 rows=4752 width=12) (actual time=1.042..12949.418 rows=5 loops=1)"
"        Filter: (owner_id = 39127)"
"        Rows Removed by Filter: 2351712"
"Total runtime: 12949.483 ms"

EXPLAIN ANALYZE SELECT
    id
FROM
    route
WHERE
    owner_id = 39127
ORDER BY
    edited_date DESC
LIMIT
    15

Query plan:
"Limit  (cost=13198.79..13198.83 rows=15 width=12) (actual time=37.781..37.821 rows=15 loops=1)"
"  ->  Sort  (cost=13198.79..13210.67 rows=4752 width=12) (actual time=37.778..37.790 rows=15 loops=1)"
"        Sort Key: edited_date"
"        Sort Method: top-N heapsort  Memory: 25kB"
"        ->  Index Scan using route_i_owner_id on route  (cost=0.43..13082.20 rows=4752 width=12) (actual time=0.039..32.425 rows=4510 loops=1)"
"              Index Cond: (owner_id = 39127)"
"Total runtime: 37.870 ms"

How can I ensure that Postgres uses the index route_i_owner_id?

I have already tried the following:


  • Increasing the statistics targets for edited_date and owner_id:

ALTER TABLE route ALTER COLUMN owner_id SET STATISTICS 1000;
ALTER TABLE route ALTER COLUMN edited_date SET STATISTICS 1000;


  • Running VACUUM ANALYZE on the whole database

Solved with the following composite index:

CREATE INDEX route_i_owner_id_edited_date
  ON public.route
  USING btree
  (owner_id, edited_date DESC);

EXPLAIN ANALYZE SELECT
    id
FROM
    route
WHERE
    owner_id = 39127
ORDER BY
    edited_date DESC
LIMIT
    5

"Limit  (cost=0.43..16.99 rows=5 width=12) (actual time=0.028..0.050 rows=5 loops=1)"
"  ->  Index Scan using route_i_owner_id_edited_date on route  (cost=0.43..15746.74 rows=4753 width=12) (actual time=0.025..0.039 rows=5 loops=1)"
"        Index Cond: (owner_id = 39127)"
"Total runtime: 0.086 ms"


Recommended answer

This query is too slow to begin with. It should take less than 1 s.

Your first example walks the edited_date index backward to read rows in sorted order, then filters them by owner_id. Because this owner has no recently edited routes, the scan discards 2,351,712 rows before it finds five matches.

Your second example uses the owner_id index to fetch all 4,510 of the owner's rows, then sorts them (a top-N heapsort, without help from an index) to pick the 15 most recent. Neither approach is ideal.
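The flip between the two plans is consistent with how the planner costs a LIMIT: the Limit node's cost looks like a linear interpolation of its child's cost, which assumes matching rows are spread evenly through the index. The sketch below reproduces the figures from the two plans in plain Python. The function name `limit_cost` is made up for illustration, the formula is a simplified model rather than the planner's actual code, and the sort plan's cost is taken from the LIMIT 15 plan and assumed to be roughly the same for LIMIT 5.

```python
# Simplified model of how a Limit node's cost is derived from its child:
# startup cost plus the fraction of the run cost needed to fetch `limit` rows.
# All figures are copied from the EXPLAIN output in the question.

def limit_cost(startup, total, est_rows, limit):
    """Interpolated cost of fetching `limit` of `est_rows` estimated rows."""
    return startup + (total - startup) * limit / est_rows

# Backward scan on route_i_edited_date, filtered by owner_id (first plan).
SCAN_STARTUP, SCAN_TOTAL, EST_ROWS = 0.43, 5368257.28, 4752

# Total cost of the sort-based plan on route_i_owner_id (second plan).
# Assumption: roughly the same for LIMIT 5 as for LIMIT 15.
SORT_PLAN_COST = 13198.83

for limit in (5, 15):
    cost = limit_cost(SCAN_STARTUP, SCAN_TOTAL, EST_ROWS, limit)
    cheaper = "backward scan" if cost < SORT_PLAN_COST else "sort plan"
    print(f"LIMIT {limit}: backward scan estimated at {cost:.2f} -> {cheaper}")

# Rows the planner would expect to read before finding 5 matches, if this
# owner's rows were evenly spread over the 2,806,976-row table.
TABLE_ROWS = 2806976
expected_rows_read = TABLE_ROWS * 5 / EST_ROWS
print(f"expected rows read: {expected_rows_read:.0f}")
```

With these inputs the model gives 5648.85 for LIMIT 5, matching the Limit cost in the first plan, and about 16945.69 for LIMIT 15, which is above the sort plan's 13198.83. It also suggests the planner expected to read roughly 2,953 rows before finding five matches, whereas the scan actually discarded 2,351,712 because this owner has no recent edits.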

What would probably speed it up is a composite index on owner_id and edited_date, which makes sense if this kind of query runs often. Such an index could also replace one of the existing indexes (most plausibly route_i_owner_id, since owner_id is its leading column), and perhaps even both.
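To see why a composite index suits this query, it can help to model a btree index as a sorted list of keys. In the toy Python sketch below, the data and the helper `latest_for_owner` are invented for illustration, and a real btree is far more involved. The point is only that entries sorted by (owner_id, edited_date) keep each owner's rows contiguous and already ordered by date, so the newest rows are found with one binary search and a short walk. The toy list is sorted ascending and walked backward; an index declared with edited_date DESC would be read forward instead.

```python
import bisect

# Toy stand-in for a btree index on (owner_id, edited_date): a sorted list of
# (owner_id, edited_date, row_id) tuples. The data is invented for illustration.
index = sorted([
    (7, 20140105, 101), (7, 20140311, 102),
    (39127, 20120101, 201), (39127, 20120215, 202), (39127, 20120303, 203),
    (39127, 20120420, 204), (39127, 20120512, 205), (39127, 20120630, 206),
    (50000, 20140401, 301), (50000, 20140402, 302),
])

def latest_for_owner(index, owner_id, limit):
    """Return up to `limit` row ids for owner_id, newest edited_date first."""
    # Position just past the last entry for this owner: one binary search.
    # Assumes integer owner ids, so (owner_id + 1,) sorts after all its entries.
    pos = bisect.bisect_left(index, (owner_id + 1,))
    rows = []
    # Walk backward; stop at the limit or when another owner's entries begin.
    while pos > 0 and len(rows) < limit and index[pos - 1][0] == owner_id:
        pos -= 1
        rows.append(index[pos][2])
    return rows

print(latest_for_owner(index, 39127, 5))
```

No other owner's entries are touched and no sort is needed, which is in line with the 0.086 ms runtime reported for the composite index in the question.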
