检查plpgsql中是否存在变量的最佳实践? [英] Best practice to check if a variable exists in plpgsql?

查看:102
本文介绍了检查plpgsql中是否存在变量的最佳实践?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

例如,我有一个存储过程,用于从csv文件导入数据并将读取的数据写入SQL表. 我有一个定义如下的表:

For instance I've got a stored procedure to import data from csv files and write the read data into a SQL table. I've got a table defined as below:

CREATE TABLE person (id int, name text, age int, married boolean); 

首先,我检查记录是否已经存在,如果存在,我会更新,如果不存在,请插入. 每个记录字段可能具有不同的类型,因此SQL命令的结果将分配给一个标量变量列表:

First I check if the record exist already, if exists I do update, if doesn't - insert. Each field of record may have a different type, so the result of a SQL command are assigned to a list of scalar variables:

SELECT name, age, married INTO v_name, v_age, v_married [..]

让我们假设每一列都被声明为可选(允许NULL).那么检查哪个变量(v_name,v_age,v_married)不是NULL并且可以处理的最佳方法是什么?

Let's assume every column is declared as optional (NULL allowed). What's the best way then to check which variable (v_name, v_age, v_married) is not NULL and can be processed?

我找到了许多解决方案:

I've found many solutions:

  • 如果没有找到
  • 何时NO_DATA_FOUND然后
  • 如果v_age不为空,则[...]

或者我现在正在使用的动态解决方案,使用上面提到的最后一种方法,当我必须检查多列(col)时:

or dynamic solution I'm using now using the last way I've mentioned above, when I have to check multiple columns (col):

list_of_columns := ARRAY['name','age','married'];
FOREACH x IN ARRAY list_of_columns LOOP
   EXECUTE 'SELECT ' || x
       || ' FROM person
            WHERE id = ' || quote_literal(v_id)
            INTO y;

   IF  x = 'name' AND (y != v_name OR y IS NULL) THEN
     UPDATE person
     SET    name = v_name
     WHERE  id = v_id;

   ELSIF x = 'age' AND (y != v_age OR y IS NULL) THEN
     UPDATE person
     SET    age = v_age
     WHERE  id = v_id;

   ELSIF x = 'married' AND (y != v_married OR y IS NULL) THEN
     UPDATE person
     SET    married= v_married
     WHERE  id = v_id;
   END IF;
END LOOP;

我正在寻求考虑最佳实践和性能的最佳解决方案. 感谢您的帮助!

I'm looking for the best solutions having regard to the best practice and performance. Any help is appreciated!

推荐答案

我认为,您可以彻底沿以下方面改进整个过程:

I think, you can radically improve the whole procedure along these lines:

BEGIN;

CREATE TEMP TABLE tmp_p ON COMMIT DROP AS
SELECT * FROM person LIMIT 0;

COPY tmp_p FROM '/absolute/path/to/file' FORMAT csv;

UPDATE person p
SET    name    = t.name
      ,age     = t.age
      ,married = t.person
FROM   tmp_p t
WHERE  p.id = t.id
AND   (p.name    IS DISTINCT FROM t.name OR
       p.age     IS DISTINCT FROM t.age  OR
       p.married IS DISTINCT FROM t.married);

INSERT INTO person p(id, name, age, married, ...)
SELECT id, name, age, married, ...
FROM   tmp_p t
WHERE  NOT EXISTS (SELECT 1 FROM person x WHERE x.id = t.id);

COMMIT; -- drops temp table because of ON COMMIT DROP

解释

  • COPY将您的CSV文件复制到具有匹配布局的临时表中.我用CREATE TABLE AS ... LIMIT 0复制了目标表的布局,您可能需要调整...

    Explain

    • COPY your CSV file to a temporary table with matching layout. I copied the layout of the target table with CREATE TABLE AS ... LIMIT 0, you may need to adapt ...

      UPDATE现有行.避免使用WHERE子句的最后3行进行空更新(什么都不会改变).
      如果要跳过UPDATE中的NULL值(真的吗?),请使用COALESCE(t.name, p.name)这样的表达式.在NULL的情况下,该值会返回到现有值.(可能有用,但是实际上在您的代码中没有.)

      UPDATE existing rows. Avoid empty updates (nothing would change) with the last 3 lines in the WHERE clause.
      If you want to skip NULL values in the UPDATE (do you really?), use expressions like COALESCE(t.name, p.name). That falls back to the existing value in the case of NULL. (May be useful, but that wasn't actually in your code.)

      INSERT不存在的行.为此,请使用NOT EXISTS半联接.

      INSERT non-existing rows. Use a NOT EXISTS semi-join for that.

      所有事务都在同一笔交易中,因此在发生问题的过程中,您不会半生不熟.临时表在事务结束时被删除,因为我是用这种方式创建的.

      All in one transaction so you don't end up with a half-baked result in the case of a problem along the way. The temp table is dropped at the end of the transaction because I created it that way.

      这篇关于检查plpgsql中是否存在变量的最佳实践?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆