Postgres: How to use arrays as stored procedure parameters efficiently?

Problem description

I need to create a Postgres 9.1 PL/pgSQL stored procedure that, among other parameters, takes a sequence of values that directly reference values in one of my database columns. As far as I can tell, the canonical way to do this in Postgres is an array.

This is a rather basic task, of course. My problem is scalability: my code basically works, but it performs badly once the sequences passed in get large (a few hundred or a few thousand values):

Even rather simple SELECT statements within my stored procedure using the array in the form

SELECT <some columns>
FROM   <some tables>
WHERE  <some other select criteria>
AND    <column with values selected by array parameter>
         IN (SELECT * FROM unnest(<array parameter>))

take several seconds to execute even though the database is not very large yet and there are only tens of values in the array.

My first suspicion was that unnest(...) is the problem, but selecting only from the table with the column referenced in the array parameter is really fast:

SELECT <some columns>
FROM   <table with column ref'd in array parameter>
WHERE  <column with values selected by array parameter>
         IN (SELECT * FROM unnest(<array parameter>))

only takes a few milliseconds.

My questions:

  1. Is there an alternative to using an array as a parameter?
  2. How can I make my queries perform better?

Solution

How can I make my queries perform better?

I would expect faster performance if you rewrite your query

SELECT <some columns>
FROM   <some tables>
WHERE  <some other select criteria>
AND    <column with values selected by array parameter>
         IN (SELECT * FROM unnest(<array parameter>));

to:

SELECT <some columns>
FROM   (SELECT unnest(<array parameter>) AS param) x
JOIN   <filtered table>  ON <filter column> = x.param
JOIN   <other table> ON <join criteria>
WHERE  <some other select criteria>;

It sounds like the query planner chooses a suboptimal plan, misjudging the cost of your other WHERE criteria in comparison to the IN clause. By transforming it to an explicit JOIN clause you should get a better query plan.

Generally, JOINs tend to be faster than large IN clauses in PostgreSQL.
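
One way to check whether the rewrite actually changes the plan is to run both forms under EXPLAIN ANALYZE and compare the output. The following is a minimal sketch, not the asker's schema: the tables item and item_detail, their columns, and the literal array are all hypothetical stand-ins for the placeholders above.

```sql
-- Hypothetical schema, for illustration only (not from the question).
CREATE TEMP TABLE item (
    item_id int PRIMARY KEY,
    label   text
);
CREATE TEMP TABLE item_detail (
    detail_id int PRIMARY KEY,
    item_id   int NOT NULL,
    active    boolean NOT NULL
);

-- Original form: filter with IN over the unnested array.
EXPLAIN ANALYZE
SELECT i.label, d.detail_id
FROM   item i
JOIN   item_detail d ON d.item_id = i.item_id
WHERE  d.active
AND    i.item_id IN (SELECT * FROM unnest('{1,2,17,18}'::int[]));

-- Rewritten form: the unnested array drives the join.
EXPLAIN ANALYZE
SELECT i.label, d.detail_id
FROM   (SELECT unnest('{1,2,17,18}'::int[]) AS param) x
JOIN   item i        ON i.item_id = x.param
JOIN   item_detail d ON d.item_id = i.item_id
WHERE  d.active;
```

Note that the two forms are only equivalent if the array holds no duplicate values: IN ignores duplicates, while the JOIN returns a matching row once per occurrence. If duplicates are possible, SELECT DISTINCT unnest(...) in the subquery restores the original semantics.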


Is there an alternative to using an array as a parameter?

Yes. You could create a temporary table, fill it, and run a query that joins against it:

CREATE TEMP TABLE x(id int);

INSERT INTO x VALUES
(1), (2), (17), (18);

SELECT <some columns>
FROM   x
JOIN   <filtered table>  ON <filter column> = x.id
JOIN   <other table> ON <join criteria>
WHERE  <some other select criteria>;

Or, faster yet, use a CTE for the same purpose:

WITH x(id) AS (
    VALUES (1::int), (2), (17), (18) -- type-cast on first element is enough
    )
SELECT <some columns>
FROM   x
JOIN   <filtered table>  ON <filter column> = x.id
JOIN   <other table> ON <join criteria>
WHERE  <some other select criteria>;

As long as you want to use a function, an array parameter that is unnested inside the function would be my choice, too. You could also use the CTE from my last example inside a function, just with unnest(arr) instead of a VALUES clause.
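
The last suggestion, a CTE over unnest(arr) inside a function, could look roughly like the sketch below. The table product, its columns, the function name f_products_by_ids, and the parameter name _ids are all hypothetical; only the pattern (array parameter, unnested in a CTE, then joined) comes from the answer.

```sql
-- Hypothetical table, for illustration only.
CREATE TEMP TABLE product (
    product_id int PRIMARY KEY,
    title      text
);

-- Hypothetical function: takes an int[] and joins against its elements.
CREATE OR REPLACE FUNCTION f_products_by_ids(_ids int[])
  RETURNS TABLE (product_id int, title text) AS
$func$
BEGIN
   RETURN QUERY
   WITH x(id) AS (SELECT unnest(_ids))   -- replaces the VALUES list
   SELECT p.product_id, p.title          -- table-qualified to avoid clashing
   FROM   x                              -- with the OUT column names
   JOIN   product p ON p.product_id = x.id;
END
$func$ LANGUAGE plpgsql STABLE;

-- Example call with an array literal.
SELECT * FROM f_products_by_ids('{1,2,17,18}');
```

The column references inside the query are table-qualified because the RETURNS TABLE column names are visible as variables in the function body, and unqualified names would be ambiguous.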
