PostgreSQL 将数据从递归 CTE 传递到函数 [英] PostgreSQL pass data from recursive CTE onto function

查看:11
本文介绍了PostgreSQL 将数据从递归 CTE 传递到函数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下问题:我试图发现从源节点 (node_s) 到目标节点 (node_t) 的所有可能路径.

I have the following problem: I am trying to discover all possible paths from source node (node_s) to target node (node_t).

带图边的原表格式很简单:|节点_x |节点_y |强度 | ,其中node_x"->node_y"是直接边,边的强度为权重".

The format of the original table with graph edges is simple: | node_x | node_y | strength | , where "node_x" -> "node_y" is a direct edge with strength of the edge being "weight".

这个想法是,如果在探索路径的任何时候我们发现其子节点中的一个节点有目标node_t,我们记录这条路径并停止探索从此节点开始的路径,否则继续探索.

The idea is if at any point during the exploration of the paths we discover that a node among its children has target node_t, we record this path and stop exploring paths from this node, otherwise continue exploration.

简单的解决方案是使用 PostgreSQL 的递归 CTE,它构建图的传递闭包:

The simple solution was to use PostgreSQL's recursive CTE which constructs the transitive closure of the graph:

  WITH RECURSIVE transitive_closure (source, target, weight, path_string) AS
  ( 
   --non-recurive term
   SELECT b.node_x, b.node_y, b.strength, b.node_x || '.' || b.node_y || '.' AS path_string
   FROM best_path b
   WHERE b.node_x = node_s --source_node

   UNION

   --recurive term
   SELECT tc.source, b.node_y, least(tc.weight,b.strength), tc.path_string || b.node_y || '.' AS path_string
   FROM best_path AS b JOIN transitive_closure AS tc ON b.node_x = tc.target
   WHERE tc.path_string NOT LIKE '%' || b.node_y || '.%'
  )
  SELECT * 
  FROM transitive_closure tc
  WHERE tc.target = node_t --target_node

上面的代码将从源节点node_s发现所有可能的路径.只有在传递闭包构造之后,我才能选择从源节点到目标节点所需的路径行(参见最后的 SELECT 语句).

The code above will discover all possible paths from source node node_s. Only after transitive closure construction, I am able to select the needed rows of paths from source node to target node (see last SELECT statement).

示例:

best_path 表有以下数据:

best_path table has the following data:

 node_x | node_y | strength 
--------+--------+----------
      1 |      2 |        1
      1 |      3 |        1
      2 |      4 |        1
      2 |      5 |        1
      3 |      6 |        1
      3 |      7 |        1
      4 |      8 |        1
      4 |      9 |        1
      5 |     10 |        1
      5 |     11 |        1

查询:

找到从源节点 = 1 到目标节点 = 4 的路径

find the paths from source node = 1, to target node = 4

结果:

 source | target | strength | path_string
--------+--------+----------+------------
      1 |      2 |        1 | 1.2.
      1 |      3 |        1 | 1.3.
      1 |      4 |        1 | 1.2.4.
      1 |      5 |        1 | 1.2.5.
      1 |      6 |        1 | 1.3.6.
      1 |      7 |        1 | 1.3.7.
      1 |      8 |        1 | 1.2.4.8.
      1 |      9 |        1 | 1.2.4.9.
      1 |     10 |        1 | 1.2.5.10.
      1 |     11 |        1 | 1.2.5.11.

这不是我需要的.由于从节点 2 到节点 4(目标)已经有一条直接边,我不需要路径 1.2.5., 1.2.4.8., 1.2.4.9.,1.2.5.10.,1.2.5.11.,路径探索对于节点 2,应该在发现从 2 到 4 的路径时停止.

This is not what I need. Since there is a direct edge from node 2 to node 4 (target) already, I do not need paths 1.2.5., 1.2.4.8., 1.2.4.9.,1.2.5.10.,1.2.5.11., the path exploration for node 2 should stop at the point when the path from 2 to 4 was discovered.

总而言之,如果节点已经有到目标节点的直接边,我不想发现它的路径.这意味着在 CTE 的递归术语中,我想有一些条件会说明以下内容,伪代码如下:

To sum up, I do not want to discover the paths of the node if it already has a direct edge to the target node. This means that in a recursive term of CTE, I would like to have some condition that will say the following, pseudocode follows:

  WITH RECURSIVE transitive_closure (source, target, weight, path_string) AS
  ( 
   --non-recurive term (as before)
   SELECT b.node_x, b.node_y, b.strength, b.node_x || '.' || b.node_y || '.' AS path_string
   FROM best_path b
   WHERE b.node_x = node_s --source_node

   UNION

   --recurive term
   SELECT tc.source, b.node_y, least(tc.weight,b.strength), tc.path_string || b.node_y || '.' AS path_string
   FROM best_path AS b JOIN transitive_closure AS tc ON b.node_x = tc.target
   WHERE tc.path_string NOT LIKE '%' || b.node_y || '.%'
     AND b.node_y = node_t 
     --will select only rows with direct edge to target

   UNION (second union is not allowed in CTE)

   SELECT those rows which do not have direct edge to target 
   AND which parents did not contribute to constructing the query above. 
   i.e. b.node_x = tc.target where there is no direct edge between b.node_x to node_t
  )

作为查找从源节点 = 1 到目标节点 = 4 的路径的查询的结果,我想要以下内容:

As a result of a query to find the paths from source node = 1, to target node = 4, I would like to have the following:

 source | target | strength | path_string
--------+--------+----------+------------
      1 |      2 |        1 | 1.2.
      1 |      3 |        1 | 1.3.
      1 |      4 |        1 | 1.2.4.
      1 |      6 |        1 | 1.3.6.
      1 |      7 |        1 | 1.3.7.

预先感谢您的帮助!

我已经尝试了很多方法:例如FROM/WHERE 子句中的条件,尝试将 CTE 传递给函数,但没有成功.

I have tried many ways already: e.g. conditions in FROM/WHERE clauses, trying to pass CTE into function, but no success.

任何建议将不胜感激.

我有自己的递归函数来实现我想要的,但是,在大量数据上它非常慢;并且 PostgreSQL 的 CTE 显然优化得很好,所以我想深入研究一下.

I have my own recursive function that achieves what I want, however, it is very slow on enormous amount of data; and PostgreSQL's CTE apparently is well-optimized, so I would like to dig into it a bit more.

推荐答案

如果从底部开始,可以提高路径搜索的效率.从孩子开始.如果从父级开始,则需要遍历所有子级;而如果你从子节点搜索,它只有一个父节点,因此不会浪费时间寻找源和目标之间的路径.

You can make the searching of the path more efficient if you start at the bottom. Start at the children. If you start at the parent, it entails traversing all the children; whereas if you searched from the child, it has only one parent, hence won't waste time finding the path between source and target.

with recursive find_parent(source, target, recentness) as
(
    select source, target, 0 
    from tbl
    where target = 9

    union all

    select i.source, i.target, fp.recentness + 1
    from tbl i
    join find_parent fp on i.target = fp.source
),
construct_path(source, target, recentness, path) as
(
  select source, target, recentness, source || '.' || target
  from find_parent 
  where recentness = (select max(recentness) from find_parent)

  union

  select dd.source, dd.target, dd.recentness, cp.path || '.' || dd.target
  from find_parent dd
  join construct_path cp on dd.recentness = cp.recentness - 1  
)
select source, target, path 
from construct_path
order by recentness desc

输出:

SOURCE   TARGET   PATH
1        2        1.2
2        4        1.2.4
4        9        1.2.4.9

现场测试:http://www.sqlfiddle.com/#!1/13e6b/1

与此类似:如何在 SQL SERVER 2005 中为父级提供子级

这是优化的,如果它已经找到了特定的(来源),则将递归减少到父级.

This is optimized, cut recursion to parent if it already find the specific one(source).

来源 = 2

目标 = 9

with recursive find_parent(source, target, recentness) as
(
    select source, target, 0 
    from tbl
    where target = 9

    union all

    select i.source, i.target, fp.recentness + 1
    from tbl i
    join find_parent fp on i.target = fp.source 
         -- despite the name, this target is another one's source
         and i.target <> 2
)
,construct_path(source, target, recentness, path) as
(
    select source, target, recentness, source || '.' || target
    from find_parent 
    where recentness = (select max(recentness) from find_parent)

    union

    select dd.source, dd.target, dd.recentness, cp.path || '.' || dd.target
    from find_parent dd
    join construct_path cp on dd.recentness = cp.recentness - 1  

)
select source, target, path
from construct_path
order by recentness desc

输出:

SOURCE   TARGET  PATH
2        4       2.4
4        9       2.4.9

现场测试:http://www.sqlfiddle.com/#!1/13e6b/16

这篇关于PostgreSQL 将数据从递归 CTE 传递到函数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆