Why is PostgreSQL calling my STABLE/IMMUTABLE function multiple times?

Question

I'm trying to optimise a complex query in PostgreSQL 9.1.2, which calls some functions. These functions are marked STABLE or IMMUTABLE and are called several times with the same arguments in the query. I assumed PostgreSQL would be smart enough to only call them once for each set of inputs - after all, that's the point of STABLE and IMMUTABLE, isn't it? But it appears that the functions are being called multiple times. I wrote a simple function to test this, which confirms it:

CREATE OR REPLACE FUNCTION test_multi_calls1(one integer)
RETURNS integer
AS $BODY$
BEGIN
    RAISE NOTICE 'Called with %', one;
    RETURN one;
END;
$BODY$ LANGUAGE plpgsql IMMUTABLE;


WITH data AS
(
    SELECT 10 AS num
    UNION ALL SELECT 10
    UNION ALL SELECT 20
)
SELECT test_multi_calls1(num)
FROM data;

Output:

NOTICE:  Called with 10
NOTICE:  Called with 10
NOTICE:  Called with 20

Why is this happening and how can I get it to only execute the function once?

Solution

The following extension of your test code is informative:

CREATE OR REPLACE FUNCTION test_multi_calls1(one integer)
RETURNS integer
AS $BODY$
BEGIN
    RAISE NOTICE 'Immutable called with %', one;
    RETURN one;
END;
$BODY$ LANGUAGE plpgsql IMMUTABLE;

CREATE OR REPLACE FUNCTION test_multi_calls2(one integer)
RETURNS integer
AS $BODY$
BEGIN
    RAISE NOTICE 'Volatile called with %', one;
    RETURN one;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;

WITH data AS
(
    SELECT 10 AS num
    UNION ALL SELECT 10
    UNION ALL SELECT 20
)
SELECT test_multi_calls1(num)
FROM data
WHERE test_multi_calls2(40) = 40
  AND test_multi_calls1(30) = 30;

Output:

NOTICE:  Immutable called with 30
NOTICE:  Volatile called with 40
NOTICE:  Immutable called with 10
NOTICE:  Volatile called with 40
NOTICE:  Immutable called with 10
NOTICE:  Volatile called with 40
NOTICE:  Immutable called with 20

Here we can see that the immutable function in the select list was called once per row, while in the WHERE clause it was called only once, and the volatile function was called three times.

The important point is not that PostgreSQL will call a STABLE or IMMUTABLE function only once for the same data (your example clearly shows that it does not), but that it may call it only once. Or it might call it twice where a volatile version would have to be called 50 times, and so on.

There are different ways in which stability and immutability can be taken advantage of, with different costs and benefits. To provide the sort of saving you expect in the select list, PostgreSQL would have to cache the results, and then look up each argument (or list of arguments) in this cache before either returning the cached result or calling the function on a cache miss. This would often be more expensive than calling your function, even with a high percentage of cache hits (and there could be 0% cache hits, meaning this "optimisation" did extra work for absolutely no gain). It could store just the last parameter and result, but again that could be completely useless.

This is especially so considering that stable and immutable functions are often the lightest functions.
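
If a function is expensive enough that the repeated calls do matter, the de-duplication can be done by hand in the query itself rather than left to the planner. The following is a minimal sketch, not something the original answer proposes: it evaluates test_multi_calls1 once per distinct input and joins the results back to the rows. The CTE names distinct_inputs and computed are made up for illustration. On the PostgreSQL 9.1 mentioned in the question every CTE is computed separately; on PostgreSQL 12 and later the planner may inline a CTE, in which case writing `computed AS MATERIALIZED (...)` should keep the calls where they are.

```sql
-- Assumes test_multi_calls1 from the example above already exists.
WITH data AS
(
    SELECT 10 AS num
    UNION ALL SELECT 10
    UNION ALL SELECT 20
),
distinct_inputs AS
(
    -- Collapse the inputs so that each distinct value appears once.
    SELECT DISTINCT num FROM data
),
computed AS
(
    -- Intended to run the function once per distinct input.
    SELECT num, test_multi_calls1(num) AS result
    FROM distinct_inputs
)
-- Join the per-value results back to the original rows.
SELECT c.result
FROM data d
JOIN computed c USING (num);
```

Whether this pays off depends on how costly the function is compared with the extra DISTINCT and join.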

With the WHERE clause, however, the immutability of test_multi_calls1 allows PostgreSQL to restructure the query from the plain meaning of the SQL given:

For every row, calculate test_multi_calls1(30) and, if the result is equal to 30, continue processing the row in question.

to a different query plan entirely:

Calculate test_multi_calls1(30) and, if it is equal to 30, continue with the query; otherwise return a zero-row result set without any further calculation.

This is the sort of use that PostgreSQL makes of STABLE and IMMUTABLE - not the caching of results, but the rewriting of queries into different queries which are more efficient but give the same results.

Note also that test_multi_calls1(30) is called before test_multi_calls2(40), no matter what order they appear in within the WHERE clause. This means that if the first call results in no rows being returned (replace = 30 with = 31 to test), then the volatile function won't be called at all, again regardless of which condition is on which side of the AND.
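
The test suggested above can be written out as follows. This is only a sketch that reuses the two functions defined earlier. Going by the behaviour described in this answer, one would expect a single notice from the immutable function, none from the volatile one, and an empty result. Prefixing the query with EXPLAIN is one way to inspect the plan that was chosen; the exact plan text varies between PostgreSQL versions.

```sql
-- Assumes test_multi_calls1 (IMMUTABLE) and test_multi_calls2 (VOLATILE)
-- from the example above already exist.
WITH data AS
(
    SELECT 10 AS num
    UNION ALL SELECT 10
    UNION ALL SELECT 20
)
SELECT test_multi_calls1(num)
FROM data
WHERE test_multi_calls2(40) = 40   -- volatile condition written first on purpose
  AND test_multi_calls1(30) = 31;  -- immutable condition that can never be true
```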

This particular sort of rewriting depends upon immutability or stability. With WHERE test_multi_calls1(30) != num, query rewriting will happen for immutable but not for merely stable functions. With WHERE test_multi_calls1(num) != 30, it won't happen at all (the function is called multiple times), though there are other optimisations possible:

Expressions containing only STABLE and IMMUTABLE functions can be used with index scans. Expressions containing VOLATILE functions cannot. The number of calls may or may not decrease, but much more importantly the results of the calls will then be used in a much more efficient way in the rest of the query (this only really matters on large tables, but there it can make a massive difference).
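
To see the index-scan point in practice, one can compare two otherwise identical queries, one using the built-in now() (STABLE) and one using the built-in clock_timestamp() (VOLATILE). This is a rough sketch: the table events, its columns, and the index name are hypothetical and appear nowhere in the question, and the plans actually chosen depend on the PostgreSQL version, the data, and the statistics.

```sql
-- Hypothetical table and index, used only for illustration.
CREATE TABLE events
(
    id         serial PRIMARY KEY,
    created_at timestamptz NOT NULL
);
CREATE INDEX events_created_at_idx ON events (created_at);

-- Fill the table with enough rows for an index scan to be attractive.
INSERT INTO events (created_at)
SELECT now() - g * interval '1 minute'
FROM generate_series(1, 100000) AS g;
ANALYZE events;

-- now() is STABLE: the comparison value is fixed for the whole statement,
-- so the planner is free to use the index on created_at.
EXPLAIN
SELECT * FROM events
WHERE created_at > now() - interval '5 minutes';

-- clock_timestamp() is VOLATILE: it may return a different value on every
-- call, so the condition has to be checked row by row.
EXPLAIN
SELECT * FROM events
WHERE created_at > clock_timestamp() - interval '5 minutes';
```

Comparing the two EXPLAIN outputs should show whether the index was considered in each case.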

In all, don't think of volatility categories in terms of memoisation, but rather in terms of giving PostgreSQL's query planner opportunities to restructure entire queries in ways that are logically equivalent (same results) but much more efficient.
