使用每个表的一列比较两个表函数的结果 [英] Compare result of two table functions using one column from each

查看:361
本文介绍了使用每个表的一列比较两个表函数的结果的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

按照说明此处,我创建了两个使用 EXECUTE FORMAT 并返回(int,smallint)的同一表。

According the instructions here I have created two functions that use EXECUTE FORMAT and return the same table of (int,smallint).

示例定义:

CREATE OR REPLACE FUNCTION function1(IN _tbl regclass, IN _tbl2 regclass, 
IN field1 integer) 
RETURNS TABLE(id integer, dist smallint)

CREATE OR REPLACE FUNCTION function2(IN _tbl regclass, IN _tbl2 regclass, 
IN field1 integer) 
RETURNS TABLE(id integer, dist smallint)

两个函数都返回完全相同的行数。示例结果(将始终按dist排序):

Both functions return the exact same number of rows. Sample result (will be always ordered by dist):

(49,0)
(206022,3)
(206041,3)
(92233,4)

有没有一种方法可以比较同一行的两个函数之间第二个字段的值,以确保两个结果都相同:

Is there a way to compare values of the second field between the two functions for the same rows, to ensure that both results are the same:

例如:

SELECT
function1('tblp1','tblp2',49),function2('tblp1_v2','tblp2_v2',49)

返回以下内容:

(49,0)      (49,0)
(206022,3)  (206022,3)
(206041,3)  (206041,3)
(92233,4)   (133,4)

尽管我不期望得到相同的结果(每个函数都是一个 topK 查询,我具有任意打破的纽带关系/为了更快的性能对第二个函数进行了一些优化)我可以确保两个函数都返回正确的结果,如果每行第二个数字结果是他一样。在上面的示例中,我可以确保得到正确的结果,因为:

Although I am not expecting identical results (each function is a topK query and I have ties which are broken arbitrarily / with some optimizations in the second function for faster performance) I can ensure that both functions return correct results, if for each row the second numbers in the results are the same. In the example above, I can ensure I get correct results, because:

1st row 0 = 0,
2nd row 3 = 3,
3rd row 3 = 3,
4th row 4 = 4

尽管事实是,对于第4行, 92233!= 133

despite the fact that for the 4th row, 92233!=133

是否有一种获取方法仅每个功能结果的第二个字段,以批量比较它们,例如

Is there a way to get only the 2nd field of each function result, to batch compare them e.g. with something like:

SELECT COUNT(*)
FROM 
(SELECT
function1('tblp1','tblp2',49).field2,
function2('tblp1_v2','tblp2_v2',49).field2 ) n2
WHERE  function1('tblp1','tblp2',49).field2 != function1('tblp1','tblp2',49).field2;

我正在使用PostgreSQL 9.3。

I am using PostgreSQL 9.3.

推荐答案


有没有办法只获取每个函数结果的第二个字段,以批量比较它们?

Is there a way to get only the 2nd field of each function result, to batch compare them?

以下所有答案均假定行按 匹配 顺序返回。

All of the following answers assume that rows are returned in matching order.

具有SRF函数爆炸行的古怪功能,可并行返回相同行数:

With the quirky feature of exploding rows from SRF functions returning the same number of rows in parallel:

SELECT count(*) AS mismatches
FROM  (
   SELECT function1('tblp1','tblp2',49) AS f1
        , function2('tblp1_v2','tblp2_v2',49) AS f2
   ) sub
WHERE  (f1).dist <> (f2).dist;  -- note the parentheses!

为消除可能的表引用的歧义,必须在行类型周围加上括号。 手册中的详细信息。

The parentheses around the row type are necessary to disambiguate from a possible table reference. Details in the manual here.

如果返回的行数不相同(这将完全为您破坏),则默认为行的笛卡尔乘积。

This defaults to Cartesian product of rows if the number of returned rows is not the same (which would break it completely for you).

您可以使用 WITH ORDINALITY 即时生成行号,而不必依赖于将SRF函数的结果配对 SELECT 列表:

You can use WITH ORDINALITY to generate a row number o the fly and don't need to depend on pairing the result of SRF functions in the SELECT list:

SELECT count(*) AS mismatches
FROM      function1('tblp1','tblp2',49)       WITH ORDINALITY AS f1(id,dist,rn)
FULL JOIN function2('tblp1_v2','tblp2_v2',49) WITH ORDINALITY AS f2(id,dist,rn) USING (rn)
WHERE  f1.dist IS DISTINCT FROM f2.dist;

这适用于每个函数中相同数量的行以及不同的行数(将被计数

This works for the same number of rows from each function as well as differing numbers (which would be counted as mismatch).

相关:

  • PostgreSQL unnest() with element number
SELECT count(*) AS mismatches
FROM   ROWS FROM (function1('tblp1','tblp2',49)
                , function2('tblp1_v2','tblp2_v2',49)) t(id1, dist1, id2, dist2)
WHERE  t.dist1 IS DISTINCT FROM t.dist2;

相关答案:

  • Is it possible to answer queries on a view before fully materializing the view?

另外:

执行格式不是设置的plpgsql功能。 返回查询是。 format() 只是用于构建查询字符串的便捷函数,可以在SQL或plpgsql中的任何位置使用。

Aside:
EXECUTE FORMAT is not a set plpgsql functionality. RETURN QUERY is. format() is just a convenient function for building a query string, can be used anywhere in SQL or plpgsql.

这篇关于使用每个表的一列比较两个表函数的结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆