遍历表,对每一行进行计算 [英] Iterate through table, perform calculation on each row

查看:60
本文介绍了遍历表,对每一行进行计算的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

首先我想说我对 SQL 非常陌生,但我现在的工作要求我在其中工作.

我有一个包含地形点数据 (x,y,z) 的数据集.我正在尝试基于此数据构建 KNN 模型.对于每个点P",我搜索数据集中最接近 P 的 100 个点(最近的意思是地理上最近的).然后我对这些点的值求平均值(这个平均值被称为残差),并将这个值添加到resid"列中的表格中.

作为概念证明,我试图简单地遍历表,并将每一行中resid"列的值设置为 1.0.

我的查询是这样的:

CREATE OR REPLACE FUNCTION LoopThroughTable() RETURNS VOID AS '声明行表%rowtype;开始SELECT * FROM table LOOP 中的 FOR 行SET row.resid = 1.0;结束循环;结尾'语言'plpgsql';SELECT LoopThroughTable() 作为输出;

此代码执行并成功返回,但是当我检查表时,没有进行任何更改.我的错误是什么?

解决方案

在循环中逐行进行更新几乎总是一个坏主意,并且极其缓慢且无法扩展.你真的应该想办法避免这种情况.

说完之后:

你的函数所做的就是改变内存中列值的值——你只是在修改一个变量的内容.如果你想更新数据,你需要一个 update 语句:

您需要在循环内使用 UPDATE :

创建或替换函数 LoopThroughTable()返回无效作为$$宣布t_row the_table%rowtype;开始FOR t_row in SELECT * FROM the_table LOOP更新 the_table设置残留 = 1.0其中 pk_column = t_row.pk_column;--<<<!!!重要的 !!!结束循环;结尾;$$语言 plpgsql;

请注意,您必须update语句的主键上添加where条件,否则您将更新all 循环的每次迭代的行.

一个稍微更有效的解决方案是使用游标,然后使用where current of

进行更新

创建或替换函数 LoopThroughTable()返回无效作为 $$宣布t_curs 游标用于select * from the_table;t_row the_table%rowtype;开始FOR t_row in t_curs LOOP更新 the_table设置残留 = 1.0其中 t_curs 的电流;结束循环;结尾;$$语言 plpgsql;

<小时><块引用>

因此,如果我在循环完成后执行 UPDATE 查询,是否会将更改提交到表?

没有.对函数的调用在调用事务的上下文中运行.因此,如果您在 SQL 客户端中禁用了自动提交,则需要在运行 SELECT LoopThroughTable()commit.

<小时>

注意语言名称是一个标识符,不要在它周围使用单引号.您还应该避免使用像 row 这样的关键字作为变量名.

使用美元报价(正如我所做的那样)也使编写函数体更容易

I would like to preface this by saying I am VERY new to SQL, but my work now requires that I work in it.

I have a dataset containing topographical point data (x,y,z). I am trying to build a KNN model based on this data. For every point 'P', I search for the 100 points in the data set nearest P (nearest meaning geographically nearest). I then average the values of these points (this average is known as a residual), and add this value to the table in the 'resid' column.

As a proof of concept, I am trying to simply iterate over the table, and set the value of the 'resid' column to 1.0 in every row.

My query is this:

CREATE OR REPLACE FUNCTION LoopThroughTable() RETURNS VOID AS '
DECLARE row table%rowtype;
BEGIN
    FOR row in SELECT * FROM table LOOP
        SET row.resid = 1.0;
    END LOOP;
END

' LANGUAGE 'plpgsql';

SELECT LoopThroughTable() as output; 

This code executes and returns successfully, but when I check the table, no alterations have been made. What is my error?

解决方案

Doing updates row-by-row in a loop is almost always a bad idea and will be extremely slow and won't scale. You should really find a way to avoid that.

After having said that:

All your function is doing is to change the value of the column value in memory - you are just modifying the contents of a variable. If you want to update the data you need an update statement:

You need to use an UPDATE inside the loop:

CREATE OR REPLACE FUNCTION LoopThroughTable() 
  RETURNS VOID 
AS
$$
DECLARE 
   t_row the_table%rowtype;
BEGIN
    FOR t_row in SELECT * FROM the_table LOOP
        update the_table
            set resid = 1.0
        where pk_column = t_row.pk_column; --<<< !!! important !!!
    END LOOP;
END;
$$ 
LANGUAGE plpgsql;

Note that you have to add a where condition on the primary key to the update statement otherwise you would update all rows for each iteration of the loop.

A slightly more efficient solution is to use a cursor, and then do the update using where current of

CREATE OR REPLACE FUNCTION LoopThroughTable() 
  RETURNS VOID 
AS $$
DECLARE 
   t_curs cursor for 
      select * from the_table;
   t_row the_table%rowtype;
BEGIN
    FOR t_row in t_curs LOOP
        update the_table
            set resid = 1.0
        where current of t_curs;
    END LOOP;
END;
$$ 
LANGUAGE plpgsql;


So if I execute the UPDATE query after the loop has finished, will that commit the changes to the table?

No. The call to the function runs in the context of the calling transaction. So you need to commit after running SELECT LoopThroughTable() if you have disabled auto commit in your SQL client.


Note that the language name is an identifier, do not use single quotes around it. You should also avoid using keywords like row as variable names.

Using dollar quoting (as I did) also makes writing the function body easier

这篇关于遍历表,对每一行进行计算的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆