Postgres planning takes disproportionate time compared to execution


Question

Postgres 9.6 running on Amazon RDS.

I have 2 tables:


  1. Aggregated events: a big table with 6 keys (ids).

  2. Campaign metadata: a small table with the campaign definitions.

I join the two in order to filter on metadata such as the campaign name.

The query produces a report of displayed counts broken down by campaign channel and date (the date is daily).

There is no FK, and the columns are not null. The report table has multiple rows per day per campaign (because the aggregation is based on 6 attribute keys).

When I join the two, query planning time grows to about 10 s (versus 300 ms).

explain analyze select c.campaign_channel as channel,date as day , sum( displayed )  as displayed
from report_campaigns c
left join events_daily r on r.campaign_id = c.c_id
where  provider_id = 7726 and c.p_id = 7726 and c.campaign_name <> 'test'
and date >= '20170513 12:00' and date <= '20170515 12:00'
group by c.campaign_channel,date;
                                                                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=71461.93..71466.51 rows=229 width=22) (actual time=104.189..114.788 rows=6 loops=1)
   Group Key: c.campaign_channel, r.date
   ->  Sort  (cost=71461.93..71462.51 rows=229 width=18) (actual time=100.263..106.402 rows=31205 loops=1)
         Sort Key: c.campaign_channel, r.date
         Sort Method: quicksort  Memory: 3206kB
         ->  Hash Join  (cost=1092.52..71452.96 rows=229 width=18) (actual time=22.149..86.955 rows=31205 loops=1)
               Hash Cond: (r.campaign_id = c.c_id)
               ->  Append  (cost=0.00..70245.84 rows=29948 width=20) (actual time=21.318..71.315 rows=31205 loops=1)
                     ->  Seq Scan on events_daily r  (cost=0.00..0.00 rows=1 width=20) (actual time=0.005..0.005 rows=0 loops=1)
                           Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone) AND (provider_id =
                     ->  Bitmap Heap Scan on events_daily_20170513 r_1  (cost=685.36..23913.63 rows=1 width=20) (actual time=17.230..17.230 rows=0 loops=1)
                           Recheck Cond: (provider_id = 7726)
                           Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
                           Rows Removed by Filter: 13769
                           Heap Blocks: exact=10276
                           ->  Bitmap Index Scan on events_daily_20170513_full_idx  (cost=0.00..685.36 rows=14525 width=0) (actual time=2.356..2.356 rows=13769 loops=1)
                                 Index Cond: (provider_id = 7726)
                     ->  Bitmap Heap Scan on events_daily_20170514 r_2  (cost=689.08..22203.52 rows=14537 width=20) (actual time=4.082..21.389 rows=15281 loops=1)
                           Recheck Cond: (provider_id = 7726)
                           Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
                           Heap Blocks: exact=10490
                           ->  Bitmap Index Scan on events_daily_20170514_full_idx  (cost=0.00..685.45 rows=14537 width=0) (actual time=2.428..2.428 rows=15281 loops=1)
                                 Index Cond: (provider_id = 7726)
                     ->  Bitmap Heap Scan on events_daily_20170515 r_3  (cost=731.84..24128.69 rows=15409 width=20) (actual time=4.297..22.662 rows=15924 loops=1)
                           Recheck Cond: (provider_id = 7726)
                           Filter: ((date >= '2017-05-13 12:00:00'::timestamp without time zone) AND (date <= '2017-05-15 12:00:00'::timestamp without time zone))
                           Heap Blocks: exact=11318
                           ->  Bitmap Index Scan on events_daily_20170515_full_idx  (cost=0.00..727.99 rows=15409 width=0) (actual time=2.506..2.506 rows=15924 loops=1)
                                 Index Cond: (provider_id = 7726)
               ->  Hash  (cost=1085.35..1085.35 rows=574 width=14) (actual time=0.815..0.815 rows=582 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 37kB
                     ->  Bitmap Heap Scan on report_campaigns c  (cost=12.76..1085.35 rows=574 width=14) (actual time=0.090..0.627 rows=582 loops=1)
                           Recheck Cond: (p_id = 7726)
                           Filter: ((campaign_name)::text <> 'test'::text)
                           Heap Blocks: exact=240
                           ->  Bitmap Index Scan on report_campaigns_provider_id  (cost=0.00..12.62 rows=577 width=0) (actual time=0.062..0.062 rows=582 loops=1)
                                 Index Cond: (p_id = 7726)
 Planning time: 9651.605 ms
 Execution time: 115.092 ms


result:
 channel  |         day         | displayed
----------+---------------------+-----------
 Pin      | 2017-05-14 00:00:00 |   43434
 Pin      | 2017-05-15 00:00:00 |   3325325235
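
One way to check whether the planning overhead comes from the partitioned events_daily table itself or from the join is to run the same filters without report_campaigns and compare the "Planning time" lines of the two outputs. The query below is only a sketch and is not part of the original question: it assumes that provider_id, date and displayed are columns of events_daily, as the plan above suggests.

```sql
-- Hypothetical comparison query: same filters on events_daily only, no join.
-- Assumes provider_id, date and displayed all live in events_daily.
EXPLAIN ANALYZE
SELECT date AS day, sum(displayed) AS displayed
FROM events_daily
WHERE provider_id = 7726
  AND date >= '20170513 12:00' AND date <= '20170515 12:00'
GROUP BY date;
```

If the planning time stays in the same range without the join, the cost is more likely tied to planning over events_daily and its child tables than to the join with report_campaigns.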


Recommended answer

It seems to me this is because the summation forces pre-computation before the left join.

A solution could be to apply the filtering WHERE clauses in two nested sub-SELECTs before the left join and the summation.

Hope this works:

SELECT c.channel, r.day, sum(r.displayed) AS displayed
FROM
  -- filter the small campaign metadata table first
  (SELECT c_id, campaign_channel AS channel
   FROM report_campaigns
   WHERE p_id = 7726 AND campaign_name <> 'test') AS c
LEFT JOIN
  -- filter the events table by provider and date range before joining
  (SELECT campaign_id, date AS day, displayed
   FROM events_daily
   WHERE provider_id = 7726
     AND date >= '20170513 12:00' AND date <= '20170515 12:00') AS r
  ON r.campaign_id = c.c_id
GROUP BY c.channel, r.day;
