Postgres查询优化问题 [英] postgres query optimization problems

查看：86 发布时间：2020/5/30 0:43:05 postgresql

本文介绍了Postgres查询优化问题的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

下面是两个几乎相同的postgres查询，但产生的查询计划和执行时间截然不同。我假设第一个查询是快速的，因为只有196个form_id ='W40'的form_instance记录，而有7000个form_id ='W30L'的记录。但是，为什么从200条记录增加到7000条记录（在我看来相对较小）会导致查询时间如此惊人的增加？我试图以各种方式为数据建立索引，以加快速度，但是基本上是费解的。我怎样才能加快速度？（请注意，两个表的模式都包含在底部）。

Below are two postgres queries that are almost identical, but produce to vastly different query plans and execution times. I'm assuming that the first query is quick, because there are only 196 form_instance records where form_id = 'W40', whereas there are 7000 where form_id = 'W30L'. But why does a jump from 200 to 7000 records (which seems relatively minor to me) cause such a staggering increase in query time? I've tried to index the data in various ways to speed this up, but am basically flummoxed. How can I speed this up? (Note that schemas for both tables are included at the bottom).

explain analyze select form_id,form_instance_id,answer,field_id
from form_instances,field_instances 
where workflow_state = 'DRqueued' and form_instance_id = form_instances.id 
and field_id in ('Book_EstimatedDueDate','H_SubmittedDate','H_Ccode','miscarriage','miscarriage_of_multiple','stillbirth','AP_IUFD_of_multiple','maternal_death','birth_includes_transport','newborn_death','H_Pid','H_Mid1','H_Mid2','H_Mid3')
and (form_id = 'W40');

QUERY PLAN                                                                                                                                
 Nested Loop  (cost=0.00..70736.14 rows=4646 width=29) (actual time=0.000..20.000 rows=2399 loops=1)
   ->  Index Scan using form_id_and_workflow_state on form_instances  (cost=0.00..1041.42 rows=507 width=8) (actual time=0.000..0.000 rows=196 loops=1)
         Index Cond: (((form_id)::text = 'W40'::text) AND ((workflow_state)::text = 'DRqueued'::text))
   ->  Index Scan using index_field_instances_on_form_instance_id on field_instances  (cost=0.00..137.25 rows=17 width=25) (actual time=0.000..0.102 rows=12 loops=196)
         Index Cond: (field_instances.form_instance_id = form_instances.id)
         Filter: ((field_instances.field_id)::text = ANY ('{Book_EstimatedDueDate,H_SubmittedDate,H_Ccode,miscarriage,miscarriage_of_multiple,stillbirth,AP_IUFD_of_multiple,maternal_death,birth_includes_transport,newborn_death,H_Pid,H_Mid1,H_Mid2,H_Mid3}'::text[]))
 Total runtime: 30.000 ms
(7 rows)

explain analyze select form_id,form_instance_id,answer,field_id 
from form_instances,field_instances 
where workflow_state = 'DRqueued' and form_instance_id = form_instances.id 
and field_id in ('Book_EstimatedDueDate','H_SubmittedDate','H_Ccode','miscarriage','miscarriage_of_multiple','stillbirth','AP_IUFD_of_multiple','maternal_death','birth_includes_transport','newborn_death','H_Pid','H_Mid1','H_Mid2','H_Mid3') 
and (form_id = 'W30L');

QUERY PLAN                                                                                                                             
 Hash Join  (cost=34300.46..160865.40 rows=31045 width=29) (actual time=65670.000..74960.000 rows=102777 loops=1)
   Hash Cond: (field_instances.form_instance_id = form_instances.id)
   ->  Bitmap Heap Scan on field_instances  (cost=29232.57..152163.82 rows=531718 width=25) (actual time=64660.000..72800.000 rows=526842 loops=1)
         Recheck Cond: ((field_id)::text = ANY ('{Book_EstimatedDueDate,H_SubmittedDate,H_Ccode,miscarriage,miscarriage_of_multiple,stillbirth,AP_IUFD_of_multiple,maternal_death,birth_includes_transport,newborn_death,H_Pid,H_Mid1,H_Mid2,H_Mid3}'::text[]))
         ->  Bitmap Index Scan on index_field_instances_on_field_id  (cost=0.00..29099.64 rows=531718 width=0) (actual time=64630.000..64630.000 rows=594515 loops=1)
               Index Cond: ((field_id)::text = ANY ('{Book_EstimatedDueDate,H_SubmittedDate,H_Ccode,miscarriage,miscarriage_of_multiple,stillbirth,AP_IUFD_of_multiple,maternal_death,birth_includes_transport,newborn_death,H_Pid,H_Mid1,H_Mid2,H_Mid3}'::text[]))
   ->  Hash  (cost=5025.54..5025.54 rows=3388 width=8) (actual time=980.000..980.000 rows=10457 loops=1)
         ->  Bitmap Heap Scan on form_instances  (cost=90.99..5025.54 rows=3388 width=8) (actual time=10.000..950.000 rows=10457 loops=1)
               Recheck Cond: (((form_id)::text = 'W30L'::text) AND ((workflow_state)::text = 'DRqueued'::text))
               ->  Bitmap Index Scan on form_id_and_workflow_state  (cost=0.00..90.14 rows=3388 width=0) (actual time=0.000..0.000 rows=10457 loops=1)
                     Index Cond: (((form_id)::text = 'W30L'::text) AND ((workflow_state)::text = 'DRqueued'::text))
 Total runtime: 75080.000 ms

# \d form_instances                                        Table "public.form_instances"     Column      |            Type             |                          Modifiers                          
-----------------+-----------------------------+-------------------------------------------------------------
 id              | integer                     | not null default nextval('form_instances_id_seq'::regclass)
 form_id         | character varying(255)      | 
 created_at      | timestamp without time zone | 
 updated_at      | timestamp without time zone | 
 created_by_id   | integer                     | 
 updated_by_id   | integer                     | 
 workflow        | character varying(255)      | 
 workflow_state  | character varying(255)      | 
 validation_data | text                        | 
Indexes:
    "form_instances_pkey" PRIMARY KEY, btree (id)
    "form_id_and_workflow_state" btree (form_id, workflow_state)
    "index_form_instances_on_form_id" btree (form_id)
    "index_form_instances_on_workflow_state" btree (workflow_state)

# \d field_instances
                                        Table "public.field_instances"
      Column      |            Type             |                          Modifiers                           
------------------+-----------------------------+--------------------------------------------------------------
 id               | integer                     | not null default nextval('field_instances_id_seq'::regclass)
 form_instance_id | integer                     | 
 created_at       | timestamp without time zone | 
 updated_at       | timestamp without time zone | 
 created_by_id    | integer                     | 
 updated_by_id    | integer                     | 
 field_id         | character varying(255)      | 
 answer           | text                        | 
 state            | character varying(255)      | 
 explanation      | text                        | 
 idx              | integer                     | not null default 0
Indexes:
    "field_instances_pkey" PRIMARY KEY, btree (id)
    "field_instances__lower_answer" btree (lower(answer))
    "index_field_instances_on_answer" btree (answer)
    "index_field_instances_on_field_id" btree (field_id)
    "index_field_instances_on_field_id_and_answer" btree (field_id, answer)
    "index_field_instances_on_form_instance_id" btree (form_instance_id)
    "index_field_instances_on_idx" btree (idx)

Postgres查询优化问题 [英] postgres query optimization problems

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

Postgres查询优化问题 [英] postgres query optimization problems

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭