Postgres stored function input-checking overhead, interpreting timing results


Question

While answering another question, Klin demonstrated an easy way of doing some loose timing tests. The question was "How expensive are exceptions?" The documentation and other sources mention that PL/PgSQL is slower than SQL for stored functions, and that EXCEPTION is expensive. I have no intuition about Postgres' performance in these situations, so I figured I'd try out a few comparisons. Klin showed how to use the (wonderful) generate_series() function to make this easy.

Here's the obligatory preamble:


  • I swear I'm not starting a fight about speed tests. I have less than no interest in that.

These are loose, artificial test cases. I'm just trying to get a feel for how different styles compare to each other: basically, what the baseline overhead is in stored functions for various approaches to input validation.

SQL and PL/PgSQL aren't interchangeable, so it's not quite fair to compare them 1:1. If you can do something in pure SQL, great. But that's not always possible.

These tests run each function 1,000,000 times to amplify what are, in absolute terms, minuscule differences in execution time.

The numbers are rounded to the nearest 10, and even then they are misleading. With modern CPUs and contemporary OSs, getting several percent of variability over "identical" runs is normal.

Just as important as all of that: the tests aren't directly comparable, because the routines do somewhat different things. So, if you're interested in this question, you have to read the code. The tests attempt to compare a few things:


  • SQL vs PL/PgSQL for a simple operation.
  • The cost of an unused EXCEPTION block.
  • The cost of an unused IF...ELSE...END IF block.
  • The cost of an EXCEPTION block and RAISE to check an input parameter.
  • The cost of an IF...ELSE...END IF block and RAISE to check an input parameter.
  • The cost of a DOMAIN-based constraint to short-circuit calls with a bad input parameter.

Here's a summary of execution times for 1,000,000 iterations each, using PG 12.1:

Language    Function                     Error     Milliseconds
SQL         test_sql                     Never             580
PL/PgSQL    test_simple                  Never            2250
PL/PgSQL    test_unused_exception_block  Never            4200
PL/PgSQL    test_if_that_never_catches   Never            2600
PL/PgSQL    test_if_that_catches         Never             310
PL/PgSQL    test_if_that_catches         Every time       2750
PL/PgSQL    test_exception_that_catches  Never            4230
PL/PgSQL    test_exception_that_catches  Every time       3950
PL/PgSQL    test_constraint              Never             310
PL/PgSQL    test_constraint              Every time       2380

Note: I varied the number of iterations on the constraint-catching tests and, yes, the time changes. So it doesn't appear that the loop breaks on the first error.

If you run the code yourself, you'll get different times, and the variability across multiple runs is pretty high. So I don't think these are the kinds of numbers you can use for more than a rough sense of things.
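To put the totals in per-call terms, here is a rough sketch that divides some of the "Never" rows from the table above by the 1,000,000 iterations. The figures are copied from that table, not measured again, and the column aliases are only illustrative. For example, 2250 ms over 1,000,000 calls works out to 2.25 microseconds per call.

```sql
-- Totals copied from the "Never" rows of the summary table
-- (milliseconds per 1,000,000 calls); nothing here is re-measured.
select fn,
       total_ms,
       round(total_ms * 1000.0 / 1000000, 2) as microseconds_per_call
from (values
        ('test_sql',                     580),
        ('test_simple',                 2250),
        ('test_unused_exception_block', 4200),
        ('test_if_that_never_catches',  2600)
     ) as t(fn, total_ms)
order by total_ms;
```

Given the run-to-run variability described above, the second decimal place should not be read as meaningful.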

Does anyone see anything completely off about the results here, or about how I calculated them? In my particular case, all of the numbers above read as "absolutely fine, it will make zero real-world difference." You need to run these things 1000+ times to even get a millisecond of difference, give or take. I'm looking at error checking for methods that are called sometimes, not a million times in a loop. My functions are going to spend their time doing real work, like searches, so the overhead of any of the approaches I tried smells inconsequential. For me, the winner looks like test_if_that_catches: namely, an IF at the start of the BEGIN block that catches bad inputs and then uses RAISE to return a report. That's a good match for how I like to structure methods anyway, it's readable, and it's simple to raise custom exceptions that way.
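A minimal sketch of that guard-clause pattern with a custom error, assuming a hypothetical function check_label (the name, parameter, hint text, and choice of error code are illustrative and not part of the tests in this question):

```sql
-- Hypothetical function: name, parameter and error details are illustrative only.
create or replace function check_label(label text)
returns int language plpgsql as $$
begin
    -- Guard clause: reject bad input before doing any real work.
    if label is null or label = '' then
        raise exception 'The string must not be empty'
            using errcode = 'invalid_parameter_value',
                  hint = 'Pass a non-empty label.';
    end if;

    return 1; -- stand-in for the real work
end $$;

select check_label('a');      -- passes the guard
-- select check_label('');    -- would raise the exception above
```

The USING clause is what makes the error "custom": callers can match on the error code instead of parsing the message text.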

I'll list the functions first, and then the test code.

--------------------------------------------
-- DOMAIN: text_not_empty
--------------------------------------------
DROP DOMAIN IF EXISTS text_not_empty;

CREATE DOMAIN text_not_empty AS
    text
    NOT NULL
    CHECK (value <> '');

COMMENT ON DOMAIN text_not_empty IS
    'The string must not be empty';

--------------------------------------------
-- FUNCTION test_sql()
--------------------------------------------
drop function if exists test_sql();
create or replace function test_sql()
returns int as $$

select 1;
$$
LANGUAGE sql;

--------------------------------------------
-- FUNCTION test_simple()
--------------------------------------------
drop function if exists test_simple();
create or replace function test_simple()
returns int language plpgsql as $$
begin
    return 1;
end $$;

--------------------------------------------
-- FUNCTION test_unused_exception_block()
--------------------------------------------
drop function if exists test_unused_exception_block();
create or replace function test_unused_exception_block()
returns int language plpgsql as $$
begin
    return 1;
exception when others then
    raise exception 'ugh';
-- Note: no exception is ever raised here, so the handler never runs,
-- yet the function is still much more expensive.
-- See the execution time in the query plans.
end $$;

--------------------------------------------
-- FUNCTION test_if_that_never_catches()
--------------------------------------------
drop function if exists test_if_that_never_catches();
create or replace function test_if_that_never_catches()
returns int language plpgsql as $$
begin
    if 1 > 2 then
        raise exception 'You have an unusually high value for 1';
        -- This never happens, I'm following Klin's previous example,
        -- just trying to measure the overhead of the if...then...end if.
    end if;

    return 1;
end $$;

--------------------------------------------
-- FUNCTION test_if_that_catches()
--------------------------------------------
drop function if exists test_if_that_catches(text_not_empty);
create or replace function test_if_that_catches(text_not_empty)
returns int language plpgsql as $$
begin
    if $1 = '' then
        raise exception 'The string must not be empty';
    end if;

    return 1;
end $$;

--------------------------------------------
-- FUNCTION test_exception_that_catches()
--------------------------------------------
drop function if exists test_exception_that_catches(text);
create or replace function test_exception_that_catches(text)
returns int language plpgsql as $$
begin
    return 1;
exception when others then
    raise exception 'The string must not be empty';
end $$;

--------------------------------------------
-- FUNCTION test_constraint()
--------------------------------------------
drop function if exists test_constraint(text_not_empty);
create or replace function test_constraint(text_not_empty)
returns int language plpgsql as $$
begin
    return 1;
end $$;


--------------------------------------------
-- Tests
--------------------------------------------
-- Run individually and look at execution time

explain analyse
select sum(test_sql())
from generate_series(1, 1000000);

explain analyse
select sum(test_simple())
from generate_series(1, 1000000);

explain analyse
select sum(test_unused_exception_block())
from generate_series(1, 1000000);

explain analyse
select sum(test_if_that_never_catches())
from generate_series(1, 1000000);

explain analyse
select sum(test_if_that_catches('')) -- Error thrown on every case
from generate_series(1, 1000000);

explain analyse
select sum(test_if_that_catches('a')) -- Error thrown on no cases
from generate_series(1, 1000000);

explain analyse
select sum(test_exception_that_catches('')) -- Error thrown on every case
from generate_series(1, 1000000);

explain analyse
select sum(test_exception_that_catches('a')) -- Error thrown on no cases
from generate_series(1, 1000000);

explain analyse
select sum(test_constraint('')) -- Error thrown on every case
from generate_series(1, 1000000);

explain analyse
select sum(test_constraint('a')) -- Error thrown on no cases
from generate_series(1, 1000000); 
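Because run-to-run variability is high, one option is to repeat a test a few times in a single session and compare the timings. The following is only a sketch: it assumes test_simple() has been created as listed above, the variable names and the count of five runs are arbitrary, and it times with clock_timestamp() rather than explain analyse, so the absolute numbers will not match the table.

```sql
-- Sketch: time the same test several times to see the spread between runs.
-- Assumes test_simple() from the listing above already exists.
do $$
declare
    started    timestamptz;
    elapsed_ms numeric;
    total      bigint;
begin
    for run in 1..5 loop
        started := clock_timestamp();

        select sum(test_simple()) into total
        from generate_series(1, 1000000);

        -- Interval since the start of this run, converted to milliseconds.
        elapsed_ms := 1000 * extract(epoch from clock_timestamp() - started);
        raise notice 'run %: % ms', run, round(elapsed_ms);
    end loop;
end $$;
```

Swapping in another function from the listing only requires changing the call inside sum().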


Recommended answer

Your tests look OK to me if all that you want to compare is the speed of various methods of verifying the correctness of inputs. Unsurprisingly, the methods that avoid calling the function in the first place win.

I concur with you that the difference is mostly irrelevant. Checking inputs is not what will decide whether your functions are efficient; that cost will get lost in the noise if the function does any real work.

Your effort is valiant, but your time might be better spent on tuning the SQL statements that the function is going to execute.
