为什么在过滤时使用相同的字段会导致不同的执行时间?(不同的索引用法) [英] Why using same field when filtering cause different execution time? (different index usage)
问题描述
当我通过 agreement_id
运行查询和过滤时,它很慢,
但是当我按别名 id
过滤时,它很快.(看查询的结尾)
When I run query and filter by agreement_id
it is slow,
but when I filter by an alias id
it is fast. (Look at the end of the query)
为什么过滤时使用相同的字段会导致不同的执行时间?
解释分析
的链接:
差异从#20 开始:使用不同索引的地方:索引条件:(o.sys_period @> sys_time())
VS 索引条件:(o.agreement_id = 38)
Difference start at #20: Where different indexes are used:
Index Cond: (o.sys_period @> sys_time())
VS Index Cond: (o.agreement_id = 38)
附注.如果我能联系到此功能的开发者就好了(我还有一个类似的问题)
PS. It would be nice if I can contact to developer of this feature (I have one more similar problem)
UPD 我做了一些实验.当我从查询中删除窗口函数时,它在任何情况下都可以快速运行.那么为什么窗口函数在某些情况下会停止使用索引?如何逃避/解决这个问题?
UPD I did some experiments. when I remove window functions from my query it works fast in any case. So why window function stop index usage in some cases? How to escape/workaround that?
dbfiddle 使用最少的测试用例
服务器版本为 v13.1
完整查询:
dbfiddle with minimal test case
Server version is v13.1
Full query:
WITH gconf AS
-- https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-SELECT
NOT MATERIALIZED -- force it to be merged into the parent query
-- it gives a net savings because each usage of the WITH query needs only a small part of the WITH query's full output.
( SELECT
ocd.*,
tstzrange( '2021-05-01', '2021-05-01', '[]') AS acc_period,
(o).agreement_id AS id, -- Required to passthrough WINDOW FUNCTION
(o).id AS order_id,
(ic).consumed_period AS consumed_period,
dense_rank() OVER ( PARTITION BY (o).agreement_id, (o).id ORDER BY (ic).consumed_period ) AS nconf,
row_number() OVER ( wconf ORDER BY (c).sort_order NULLS LAST ) AS nitem,
(sum( ocd.item_cost ) OVER wconf)::numeric( 10, 2) AS conf_cost,
max((ocd.ic).consumed) OVER wconf AS consumed,
CASE WHEN true
THEN (sum( ocd.item_suma ) OVER wconf)::numeric( 10, 2 )
ELSE (sum( ocd.item_cost ) OVER wconf)::numeric( 10, 2 )
END AS conf_suma
FROM order_cost_details( tstzrange( '2021-05-01', '2021-05-01', '[]') ) ocd
WHERE true OR (ocd.ic).consumed_period @> lower( tstzrange( '2021-05-01', '2021-05-01', '[]') )
WINDOW wconf AS ( PARTITION BY (o).agreement_id, (o).id, (ic).consumed_period )
),
gorder AS (
SELECT *,
(conf_suma/6)::numeric( 10, 2 ) as conf_nds,
sum( conf_suma ) FILTER (WHERE nitem = 1) OVER worder AS order_suma
FROM gconf
WINDOW worder AS ( PARTITION BY gconf.id, (o).id )
-- TODO: Ask PG developers: Why changing to (o).agreement_id slows down query?
-- WINDOW worder AS ( PARTITION BY (o).agreement_id, (o).id )
)
SELECT
u.id, consumed_period, nconf, nitem,
(c).id as item_id,
COALESCE( (c).sort_order, pd.sort_order ) as item_order,
COALESCE( st.display, st.name, rt.display, rt.name ) as item_name,
COALESCE( item_qty, (c).amount/rt.unit ) as item_qty,
COALESCE( (p).label, rt.label ) as measure,
item_price, item_cost, item_suma,
conf_cost, consumed, conf_suma, conf_nds, order_suma,
(order_suma/6)::numeric( 10, 2 ) as order_nds,
sum( conf_suma ) FILTER (WHERE nitem = 1 ) OVER wagreement AS total_suma,
sum( (order_suma/6)::numeric( 10, 2 ) ) FILTER (WHERE nitem = 1 AND nconf = 1) OVER wagreement AS total_nds,
pkg.id as package_id,
pkg.link_1c_id as package_1c_id,
COALESCE( pkg.display, pkg.name ) as package,
acc_period
FROM gorder u
LEFT JOIN resource_type rt ON rt.id = (c).resource_type_id
LEFT JOIN service_type st ON st.id = (c).service_type_id
LEFT JOIN package pkg ON pkg.id = (o).package_id
LEFT JOIN package_detail pd ON pd.package_id = (o).package_id
AND pd.resource_type_id IS NOT DISTINCT FROM (c).resource_type_id
AND pd.service_type_id IS NOT DISTINCT FROM (c).service_type_id
-- WHERE (o).agreement_id = 38 -- slow
WHERE u.id = 38 -- fast
WINDOW wagreement AS ( PARTITION BY (o).agreement_id )
推荐答案
作为解决问题的方法,我们可以另外 SELECT
为 PARTITION BY
表达式中使用的列添加别名.然后PG应用优化并使用索引.
As problem workaround we can additionally SELECT
an alias for column used at PARTITION BY
expression. Then PG apply optimization and use index.
问题的答案可能是:如果使用复合类型,PG 不会应用优化.请注意它的工作原理:
The answer to the question could be: PG does not apply optimization if composite type is used. Notice as it works:
PARTITION | FILTER | IS USED?
------------------------------
ALIAS | ORIG | NO
ALIAS | ALIAS | YES
ORIG | ALIAS | NO
ORIG | ORIG | NO
create table agreement ( ag_id int, name text, cost numeric(10,2) );
create index ag_idx on agreement (ag_id);
insert into agreement (ag_id, name, cost) values ( 1, '333', 22 ),
(1,'333', 33), (1, '333', 7), (2, '555', 18 ), (2, '555', 2), (3, '777', 4);
select * from agreement;
create function initial ()
returns table( agreement_id int, ag agreement ) language sql stable AS $$
select ag_id, t from agreement t;
$$;
select * from initial() t;
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by agreement_id ) as total
from initial() t
)
select * from totals_by_ag t
where (t.ag).ag_id = 1; -- index is NOT USED
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by agreement_id ) as total
from initial() t
)
select * from totals_by_ag t
where agreement_id = 1; -- index is used when alias for column is used
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
from initial() t
)
select * from totals_by_ag t
where agreement_id = 1; -- index is NOT USED because grouping by original column
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select
*,
sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
from initial() t
)
select * from totals_by_ag t
where (t.ag).ag_id = 1; -- index is NOT USED even if at both cases original column
这篇关于为什么在过滤时使用相同的字段会导致不同的执行时间?(不同的索引用法)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!