如何在PostgreSQL中将2d数组快速嵌套到1d数组中? [英] How to unnest a 2d array into a 1d array quickly in PostgreSQL?
问题描述
我有一个非常大的数组,可以使用Apache Madlib计算得出,我想对该2d数组中的每个单个数组应用一个运算。
I have a really large array that have I computed with Apache Madlib and I would like to apply an operation to each single array in that 2d array.
我有在此相关答案中找到了可以帮助我取消嵌套的代码。但是,在这个非常大的2d数组(150,000 + 1d float数组)上,代码的速度非常慢。尽管 unnest()
只需要几秒钟即可运行,即使等待几分钟后代码仍未完成。
I have found code that can help me unnest it from this related answer. However, the code is miserably slow on this really large 2d array (150,000+ 1d float arrays). While unnest()
only takes a few seconds to run, even after waiting for several minutes the code has not completed.
当然,必须有一种更快的方法将大型2d阵列嵌套成较小的1d阵列吗?如果该解决方案使用Apache Madlib,则有加分。我确实在文档 deconstruct_2d_array
中发现了一根引线,但是,当我尝试在矩阵上调用该函数时,它失败并出现以下错误:
Surely, there must be a faster way to unnest the large 2d array into smaller 1d arrays? Bonus point if that solution uses Apache Madlib. I did find one lead buried in the documentation called deconstruct_2d_array
, however, when I try to call that function on the matrix, it fails with the following error:
错误:函数 deconstruct_2d_array(double precision []):无效的
类型转换。内部复合类型比
后端复合类型具有更多的元素。
ERROR: Function "deconstruct_2d_array(double precision[])": Invalid type conversion. Internal composite type has more elements than backend composite type.
推荐答案
您在我的旧答案中发现,不适用于大型阵列。我从没想过数组的大小,它应该应该是一个集合(一个表)。
The function you found in my old answer does not scale well for big arrays. I never thought of arrays your size, which should probably be a set (a table) instead.
plpgsql函数可能会替换<一个href = https://stackoverflow.com/questions/8137112/unnest-array-by-one-level>引用的答案。需要Postgres 9.1或更高版本。
Be that as it may, this plpgsql function replaces the one in the referenced answer. Requires Postgres 9.1 or later.
CREATE OR REPLACE FUNCTION unnest_2d_1d(ANYARRAY, OUT a ANYARRAY)
RETURNS SETOF ANYARRAY AS
$func$
BEGIN
FOREACH a SLICE 1 IN ARRAY $1 LOOP
RETURN NEXT;
END LOOP;
END
$func$ LANGUAGE plpgsql IMMUTABLE STRICT;
在Postgres 9.6中的大型2d阵列上进行测试时,速度提高了40倍。
40x faster in my test on a big 2d-array in Postgres 9.6.
STRICT
避免NULL输入出现异常(如由IamIC评论):
STRICT
to avoid an exception for NULL input (as commented by IamIC):
错误:FOREACH表达式不能为空
ERROR: FOREACH expression must not be null
这篇关于如何在PostgreSQL中将2d数组快速嵌套到1d数组中?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!