使用每个表的一列比较两个表函数的结果 [英] Compare result of two table functions using one column from each
问题描述
按照说明此处,我创建了两个使用 EXECUTE FORMAT $ c的函数$ c>并返回
(int,smallint)
的同一表。
According the instructions here I have created two functions that use EXECUTE FORMAT
and return the same table of (int,smallint)
.
示例定义:
CREATE OR REPLACE FUNCTION function1(IN _tbl regclass, IN _tbl2 regclass,
IN field1 integer)
RETURNS TABLE(id integer, dist smallint)
CREATE OR REPLACE FUNCTION function2(IN _tbl regclass, IN _tbl2 regclass,
IN field1 integer)
RETURNS TABLE(id integer, dist smallint)
两个函数都返回完全相同的行数。示例结果(将始终按dist排序):
Both functions return the exact same number of rows. Sample result (will be always ordered by dist):
(49,0)
(206022,3)
(206041,3)
(92233,4)
有没有一种方法可以比较同一行的两个函数之间第二个字段的值,以确保两个结果都相同:
Is there a way to compare values of the second field between the two functions for the same rows, to ensure that both results are the same:
例如:
SELECT
function1('tblp1','tblp2',49),function2('tblp1_v2','tblp2_v2',49)
返回以下内容:
(49,0) (49,0)
(206022,3) (206022,3)
(206041,3) (206041,3)
(92233,4) (133,4)
尽管我不期望得到相同的结果(每个函数都是一个 topK 查询,我具有任意打破的纽带关系/为了更快的性能对第二个函数进行了一些优化)我可以确保两个函数都返回正确的结果,如果每行第二个数字结果是他一样。在上面的示例中,我可以确保得到正确的结果,因为:
Although I am not expecting identical results (each function is a topK query and I have ties which are broken arbitrarily / with some optimizations in the second function for faster performance) I can ensure that both functions return correct results, if for each row the second numbers in the results are the same. In the example above, I can ensure I get correct results, because:
1st row 0 = 0,
2nd row 3 = 3,
3rd row 3 = 3,
4th row 4 = 4
尽管事实是,对于第4行, 92233!= 133
despite the fact that for the 4th row, 92233!=133
是否有一种获取方法仅每个功能结果的第二个字段,以批量比较它们,例如
Is there a way to get only the 2nd field of each function result, to batch compare them e.g. with something like:
SELECT COUNT(*)
FROM
(SELECT
function1('tblp1','tblp2',49).field2,
function2('tblp1_v2','tblp2_v2',49).field2 ) n2
WHERE function1('tblp1','tblp2',49).field2 != function1('tblp1','tblp2',49).field2;
我正在使用PostgreSQL 9.3。
I am using PostgreSQL 9.3.
推荐答案
有没有办法只获取每个函数结果的第二个字段,以批量比较它们?
Is there a way to get only the 2nd field of each function result, to batch compare them?
以下所有答案均假定行按 匹配 顺序返回。
All of the following answers assume that rows are returned in matching order.
具有SRF函数爆炸行的古怪功能,可并行返回相同行数:
With the quirky feature of exploding rows from SRF functions returning the same number of rows in parallel:
SELECT count(*) AS mismatches
FROM (
SELECT function1('tblp1','tblp2',49) AS f1
, function2('tblp1_v2','tblp2_v2',49) AS f2
) sub
WHERE (f1).dist <> (f2).dist; -- note the parentheses!
为消除可能的表引用的歧义,必须在行类型周围加上括号。 手册中的详细信息。
The parentheses around the row type are necessary to disambiguate from a possible table reference. Details in the manual here.
如果返回的行数不相同(这将完全为您破坏),则默认为行的笛卡尔乘积。
This defaults to Cartesian product of rows if the number of returned rows is not the same (which would break it completely for you).
您可以使用 WITH ORDINALITY
即时生成行号,而不必依赖于将SRF函数的结果配对 SELECT
列表:
You can use WITH ORDINALITY
to generate a row number o the fly and don't need to depend on pairing the result of SRF functions in the SELECT
list:
SELECT count(*) AS mismatches
FROM function1('tblp1','tblp2',49) WITH ORDINALITY AS f1(id,dist,rn)
FULL JOIN function2('tblp1_v2','tblp2_v2',49) WITH ORDINALITY AS f2(id,dist,rn) USING (rn)
WHERE f1.dist IS DISTINCT FROM f2.dist;
这适用于每个函数中相同数量的行以及不同的行数(将被计数
This works for the same number of rows from each function as well as differing numbers (which would be counted as mismatch).
相关:
- PostgreSQL unnest() with element number
SELECT count(*) AS mismatches
FROM ROWS FROM (function1('tblp1','tblp2',49)
, function2('tblp1_v2','tblp2_v2',49)) t(id1, dist1, id2, dist2)
WHERE t.dist1 IS DISTINCT FROM t.dist2;
相关答案:
- Is it possible to answer queries on a view before fully materializing the view?
另外:
执行格式
不是设置的plpgsql功能。 返回查询
是。 format()
只是用于构建查询字符串的便捷函数,可以在SQL或plpgsql中的任何位置使用。
Aside:
EXECUTE FORMAT
is not a set plpgsql functionality. RETURN QUERY
is. format()
is just a convenient function for building a query string, can be used anywhere in SQL or plpgsql.
这篇关于使用每个表的一列比较两个表函数的结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!