如何在PostgreSQL中使用游标来提高函数的性能? [英] How to improve performance of a function with cursors in PostgreSQL?
问题描述
我有两个嵌套游标的函数。外部游标从源中获取客户的付款细节,并基于一些业务逻辑插入到目标中。
I have function with two nested cursors. The outer cursor gets payment details of a customer from the source and inserts into the target based on some business logic. The inner cursor takes the payment details of each payment, it happens one after another.
付款表格大约有125000行,大约有335000行表示付款详情。所有这些行都将迁移到目标表。执行该函数需要两个小时,数据库CPU使用率上升到99%。
我正在使用PostgreSQL 9.2。
The payments table has about 125000 rows, and about 335000 rows for payment details. All of these rows are to be migrated to a target table. Executing the function takes over two hours and the database CPU usage goes up to 99%.
I am working with PostgreSQL 9.2.
我如何提高功能的性能?
How can I improve the performance of the function?
我使用的代码:
CREATE OR REPLACE FUNCTION ccdb_dummy.o_payments1(a integer)
RETURNS void AS
$BODY$
DECLARE
cursor_1 refcursor;
cursor_2 refcursor;
rowcount integer;
rowcount1 integer;
payment_no bigint;
query_1 character varying(250);
lc_cin_num bigint;
test character varying(50);
t_payments ccdb_stg.o_payments_stg%ROWTYPE;
t_payments_details ccdb_stg.o_payment_head_dtls_stg%ROWTYPE;
BEGIN
rowcount := 0;
open cursor_1 for select * from ccdb_stg.o_payments_stg WHERE section_code = a;
select count(1) into rowcount from ccdb_stg.o_payments_stg WHERE section_code = a;
for i IN 1..rowcount loop
fetch cursor_1 into t_payments;
payment_no= nextval('ccdb_stg.payments_seq');
select cin into lc_cin_num from ccdb_dummy.consumers a where a.consumer_num = t_payments.consumer_num;
insert into ccdb_dummy.payments(payment_id,receipt_id,source_system_flag,cin,consumer_nbr,cust_connection_id,cust_type_flg,receipt_type_id,mop_code,mop_details,coll_effect_date,coll_entry_date,receipt_num,receipt_amt,receipt_loc_flg,receipt_date,cancel_flag,acc_type_id,cust_section_code,coll_section_code,remarks,pm_paydate,pm_amount,ref_transaction_id,creation_dt,created_by) values(payment_no,t_payments.receipt_id,t_payments.origin_flag,lc_cin_num,t_payments.consumer_num,t_payments.cust_connection_id,t_payments.cust_type_flag,t_payments.receipt_type_id,t_payments.mop_id,t_payments.mop_details,t_payments.coll_effect_date,t_payments.coll_entry_date,t_payments.receipt_num,t_payments.receipt_amt,t_payments.receipt_flag,t_payments.receipt_date,t_payments.cancel_flag,t_payments.acc_type_flag,t_payments.cust_section_code,t_payments.coll_section_code,t_payments.remarks,t_payments.pm_paydate,t_payments.pm_amount,null,now(),'system');
select count(1) into rowcount1 from ccdb_stg.o_payment_head_dtls_stg a where a.mbc_receipt_id = t_payments.receipt_id;
open cursor_2 for select * from ccdb_stg.o_payment_head_dtls_stg a where a.mbc_receipt_id = t_payments.receipt_id;
for i IN 1..rowcount1 loop
fetch cursor_2 into t_payments_details;
insert into ccdb_dummy.payment_head_dtls(payment_id,mbc_receipt_id,charge_head_code,amount,tariff_id,creation_dt,created_by)
values (payment_no,t_payments_details.mbc_receipt_id,t_payments_details.charge_head_code,t_payments_details.amount,t_payments_details.tariff_id,now(),'system');
end loop;
close cursor_2;
end loop;
close cursor_1;
END;
$BODY$
LANGUAGE plpgsql;
推荐答案
在第一步,我彻底简化了程序代码:
In a first step, I radically simplified your procedural code:
CREATE OR REPLACE FUNCTION ccdb_dummy.o_payments1(a integer)
RETURNS void AS
$func$
DECLARE
t record;
t1 record;
BEGIN
FOR t IN
SELECT *
,nextval('ccdb_stg.payments_seq') AS payment_no
,c.cin
FROM ccdb_stg.o_payments_stg p
LEFT JOIN ccdb_dummy.consumers c USING (consumer_num)
WHERE p.section_code = $1
LOOP
INSERT INTO ccdb_dummy.payments(payment_id,receipt_id,source_system_flag,cin, ... ,pm_amount,ref_transaction_id,creation_dt,created_by)
VALUES(t.payment_no,t.receipt_id,t.origin_flag,t.cin, ... ,t.pm_amount,null,now(),'system');
FOR t1 IN
SELECT *
FROM ccdb_stg.o_payment_head_dtls_stg h
WHERE h.mbc_receipt_id = t.receipt_id
LOOP
INSERT INTO ccdb_dummy.payment_head_dtls(payment_id,mbc_receipt_id,charge_head_code,amount,tariff_id,creation_dt,created_by)
VALUES (t.payment_no,t1.mbc_receipt_id,t1.charge_head_code,t1.amount,t1.tariff_id,now(),'system');
END LOOP;
END LOOP;
END
$func$ LANGUAGE plpgsql;
-
使用
FOR LOOP
,而不是冗余的显式游标加上冗余计数和循环。 更简单,更快速。请阅读手册中的查询结果循环一章。Use the implicit cursor of a
FOR LOOP
instead of unwieldy explicit cursors coupled with redundant counts and loops. Much simpler and faster. Read the chapter "Looping Through Query Results" in the manual.LEFT JOIN
至ccdb_dummy.consumers $ c
LEFT JOIN
toccdb_dummy.consumers
in the first SELECT instead of running a separate select for every row.还包括
nextval('ccdb_stg.payments_seq'),而不是对每一行执行单独的选择。 )AS payment_no
在第一个SELECT。Also include
nextval('ccdb_stg.payments_seq') AS payment_no
in the first SELECT. cheaper than lots of separate queries.细节:plpgsql中的赋值运算符
:=
=
。 此处的详情。Minor detail: assignment operator in plpgsql is
:=
, not=
. Details here.但这还远远不够完美。考虑使用基于集合的操作的全新方法,而不是循环中的单个插入。更清洁和更快,但。
But that's far from perfect, still. Consider a completely new approach with set-based operations instead of individual inserts in loops. Much cleaner and faster, yet. That's how modern RDBMS operate best.
包含在SQL函数中以成为替换。
数据修改CTE 需要Postgres 9.1 或更高版本。Wrapped into an SQL function to be a drop-in replacement.
Data-modifying CTEs require Postgres 9.1 or later.CREATE OR REPLACE FUNCTION ccdb_dummy.o_payments2(integer) RETURNS void AS $func$ WITH ins1 AS ( INSERT INTO ccdb_dummy.payments( payment_id, cin, receipt_id, ... , pm_amount, ref_transaction_id,creation_dt,created_by) SELECT nextval('ccdb_stg.payments_seq'),c.cin,p.receipt_id, ... , p.pm_amount, null, now(), 'system' FROM ccdb_stg.o_payments_stg p LEFT JOIN ccdb_dummy.consumers c USING (consumer_num) WHERE p.section_code = $1 RETURNING payment_id, receipt_id ) INSERT INTO ccdb_dummy.payment_head_dtls( payment_id, mbc_receipt_id, charge_head_code, amount, tariff_id,creation_dt,created_by) SELECT i.payment_id,h.mbc_receipt_id,h.charge_head_code,h.amount,h.tariff_id,now(), 'system' FROM ins1 i JOIN ccdb_stg.o_payment_head_dtls_stg h ON h.mbc_receipt_id = i.receipt_id; $func$ LANGUAGE sql;
应该与上述plpgsql函数完全相同(禁止翻译错误)。只需更简单更快速。
Should do the same as the above plpgsql function exactly (barring errors in translation). Just much simpler and faster.
这篇关于如何在PostgreSQL中使用游标来提高函数的性能?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!