postgres 函数:IMMUTABLE 何时会损害性能? [英] postgres functions: when does IMMUTABLE hurt performance?

查看:45
本文介绍了postgres 函数:IMMUTABLE 何时会损害性能?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Postgres 文档

为了获得最佳优化结果,您应该使用对其有效的最严格的波动率类别标记您的函数.

For best optimization results, you should label your functions with the strictest volatility category that is valid for them.

但是,我似乎有一个例子说明情况并非如此,我想了解发生了什么.(背景:我正在运行 postgres 9.2)

However, I seem to have an example where this is not the case, and I'd like to understand what's going on. (Background: I'm running postgres 9.2)

我经常需要将表示为整数秒的时间转换为日期.我写了一个函数来做到这一点:

I often need to convert times expressed as integer numbers of seconds to dates. I've written a function to do this:

CREATE OR REPLACE FUNCTION 
  to_datestamp(time_int double precision) RETURNS date AS $$
  SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL;

让我们将性能与其他相同的函数进行比较,并将波动性设置为 IMMUTABLE 和 STABLE:

Let's compare performance to otherwise identical functions, with volatility set to IMMUTABLE and to STABLE:

CREATE OR REPLACE FUNCTION 
  to_datestamp_immutable(time_int double precision) RETURNS date AS $$
  SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION 
  to_datestamp_stable(time_int double precision) RETURNS date AS $$
  SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL STABLE;

为了测试这一点,我将创建一个包含 10^6 个随机整数的表格,这些整数对应于 2010-01-01 和 2015-01-01 之间的时间

To test this, I'll create a table of 10^6 random integers corresponding to times between 2010-01-01 and 2015-01-01

CREATE TEMPORARY TABLE random_times AS
  SELECT 1262304000 + round(random() * 157766400) AS time_int 
  FROM generate_series(1, 1000000) x;

最后,我会调用这个表上的两个函数;在我的特定盒子上,原始版本需要约 6 秒,不可变版本需要约 33 秒,稳定版本需要约 6 秒.

Finally, I'll time calling the two functions on this table; on my particular box, the original takes ~6 seconds, the immutable version takes ~33 seconds, and the stable version takes ~6 seconds.

EXPLAIN ANALYZE SELECT to_datestamp(time_int) FROM random_times;

Seq Scan on random_times  (cost=0.00..20996.62 rows=946950 width=8) 
  (actual time=0.150..5493.722 rows=1000000 loops=1)
Total runtime: 6258.827 ms


EXPLAIN ANALYZE SELECT to_datestamp_immutable(time_int) FROM random_times;

Seq Scan on random_times  (cost=0.00..250632.00 rows=946950 width=8) 
  (actual time=0.211..32209.964 rows=1000000 loops=1)
Total runtime: 33060.918 ms


EXPLAIN ANALYZE SELECT to_datestamp_stable(time_int) FROM random_times;
Seq Scan on random_times  (cost=0.00..20996.62 rows=946950 width=8)
  (actual time=0.086..5295.608 rows=1000000 loops=1)
Total runtime: 6063.498 ms

这是怎么回事?例如,由于传递给函数的参数不太可能重复,postgres 是否会花时间缓存结果,而这实际上并没有帮助?

What's going on here? E.g., is postgres spending time caching results when that won't actually be helpful since the arguments passed to the function are unlikely to repeat?

(我正在运行 postgres 9.2.)

(I'm running postgres 9.2.)

谢谢!

更新

感谢 Craig Ringer 这已在 pgsql-performance 邮件列表.亮点:

Thanks to Craig Ringer this has been discussed on the pgsql-performance mailing list. Highlights:

汤姆·莱恩说

[ shrug... ] 使用 IMMUTABLE 对函数的可变性撒谎(在这种情况下,date_trunc)是一个坏主意.很可能会导致错误答案,不要介意性能问题.在这种特殊情况下,我想象一下性能问题来自抑制了选项内联函数体......但你应该更担心在其他情况下,您是否没有得到完全虚假的答案.

[ shrug... ] Using IMMUTABLE to lie about the mutability of a function (in this case, date_trunc) is a bad idea. It's likely to lead to wrong answers, never mind performance issues. In this particular case, I imagine the performance problem comes from having suppressed the option to inline the function body ... but you should be more worried about whether you aren't getting flat-out bogus answers in other cases.

Pavel Stehule 说

如果我理解,使用的 IMMUTABLE 标志会禁用内联.你看到的是SQL 评估溢出.我的规则是 - 尽可能不要在 SQL 函数中使用标志.

If I understand, a used IMMUTABLE flag disables inlining. What you see, is SQL eval overflow. My rule is - don't use flags in SQL functions, when it is possible.

推荐答案

问题在于 to_timestamp 返回带时区的时间戳.如果将 to_timestamp 函数替换为没有时区的手动"计算,则性能没有差异

The problem is that to_timestamp returns timestamp with time zone. If the to_timestamp function is replaced with a "manual" calculation without time zone there is no difference in performance

create or replace function to_datestamp_stable(
    time_int double precision
) returns date as $$
  select date_trunc('day', timestamp 'epoch' + $1 * interval '1 second')::date;
$$ language sql stable;

explain analyze
select to_datestamp_stable(a)
from generate_series(1, 1000000) s (a);
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..22.50 rows=1000 width=4) (actual time=96.962..433.562 rows=1000000 loops=1)
 Total runtime: 459.531 ms

create or replace function to_datestamp_immutable(
    time_int double precision
) returns date as $$
  select date_trunc('day', timestamp 'epoch' + $1 * interval '1 second')::date;
$$ language sql immutable;

explain analyze
select to_datestamp_immutable(a)
from generate_series(1, 1000000) s (a);
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..22.50 rows=1000 width=4) (actual time=94.188..433.492 rows=1000000 loops=1)
 Total runtime: 459.434 ms

使用to_timestamp

create or replace function to_datestamp_stable(
    time_int double precision
) returns date as $$
  select date_trunc('day', to_timestamp($1))::date;
$$ language sql stable;

explain analyze
select to_datestamp_stable(a)
from generate_series(1, 1000000) s (a);
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..20.00 rows=1000 width=4) (actual time=91.924..3059.570 rows=1000000 loops=1)
 Total runtime: 3103.655 ms

create or replace function to_datestamp_immutable(
    time_int double precision
) returns date as $$
  select date_trunc('day', to_timestamp($1))::date;
$$ language sql immutable;

explain analyze
select to_datestamp_immutable(a)
from generate_series(1, 1000000) s (a);
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series s  (cost=0.00..262.50 rows=1000 width=4) (actual time=92.639..20083.920 rows=1000000 loops=1)
 Total runtime: 20149.311 ms

这篇关于postgres 函数:IMMUTABLE 何时会损害性能?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆