有向图SQL [英] Directed graph SQL

查看:112
本文介绍了有向图SQL的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下数据集,这些数据集表示有向图中的节点.

I have the following data set, which represents nodes in a directed graph.

CREATE TABLE nodes (NODE_FROM VARCHAR2(10),
                    NODE_TO VARCHAR2(10));

INSERT INTO nodes VALUES('GT','TG');
INSERT INTO nodes VALUES('GG','GC');
INSERT INTO nodes VALUES('AT','TG');
INSERT INTO nodes VALUES('TG','GC');
INSERT INTO nodes VALUES('GC','CG');
INSERT INTO nodes VALUES('TG','GG');
INSERT INTO nodes VALUES('GC','CA');
INSERT INTO nodes VALUES('CG','GT');

视觉表示: http://esser.hopto.org/temp/image1.JPG

使用此数据集,我希望用户输入一个级别(例如2),这将使所有节点离开特定节点2个跳"):

Using this data set, I want a user to enter a level (e.g. 2) and this returns all nodes 2 "hops" away from a specific node):

NODE_FROM  NODE_TO

TG        GC
TG        GG
AT        TG
GT          TG

http://esser.hopto.org/temp/image2.JPG

我当前的尝试如下:

SELECT node_from, node_to
  FROM nodes
  WHERE level <= 2   -- Display nodes two "hops" from 'AT'
START WITH node_from = 'AT'
CONNECT BY NOCYCLE PRIOR node_to = node_from
    OR    node_to = PRIOR node_from
GROUP BY node_from, node_to;

http://esser.hopto.org/temp/image3.JPG

如您所见,关系:GT-> TG丢失.

As you can see, the relationship: GT -> TG is missing.

推荐答案

所以您的图形如下所示:

So your graph looks like this:

您可以使用Oracle的START WITH/CONNECT BY功能执行所需的操作.如果从节点GA开始,则可以到达图中的所有节点,如下所示.

You can use Oracle's START WITH/CONNECT BY feature to do what you want. If we start at node GA, we can reach all nodes in the graph, as shown below.

CREATE TABLE edges (PARENT VARCHAR(100), CHILD VARCHAR(100));

insert into edges values ('AT', 'TG');
insert into edges values ('CG', 'GT');
insert into edges values ('GA', 'AT');
insert into edges values ('GC', 'CA');
insert into edges values ('GC', 'CG');
insert into edges values ('GG', 'GC');
insert into edges values ('GT', 'TG');
insert into edges values ('TG', 'GA');
insert into edges values ('TG', 'GC');
insert into edges values ('TG', 'GG');
COMMIT;

SELECT *
  FROM edges
START WITH CHILD = 'GA'
CONNECT BY NOCYCLE PRIOR CHILD = PARENT;

输出:

    PARENT  CHILD
1   TG      GA
2   GA      AT
3   AT      TG
4   TG      GC
5   GC      CA
6   GC      CG
7   CG      GT
8   CG      GT
9   GC      CA

注意 由于您的图形具有循环,因此在CONNECT BY上使用NOCYCLE语法很重要,否则将无法正常工作.

NOTE Since your graph has cycles, it's important to use the NOCYCLE syntax on the CONNECT BY, otherwise this won't work.

基于OP最新编辑的答案

首先,我假设"2跳"表示最多2跳",因为您当前的查询使用的是level <= 2.如果恰好需要2跳,则应为level = 2.

First of all, I assume that by "2 hops" you mean "at most 2 hops", because your current query is using level <= 2. If you want exactly 2 hops, it should be level = 2.

在更新的图形(image2.JPG)中,没有从AT到GT的路径需要2跳,因此查询返回的是我期望的值.从AT到GT,我们可以转到AT->TG->GC->CG->GT,但是那是4跳,大于2,所以这就是为什么您没有得到该结果的原因.

In your updated graph (image2.JPG), there is no path from AT to GT that takes 2 hops, so the query is returning what I would expect. From AT to GT, we can go AT->TG->GC->CG->GT, but that's 4 hops, which is greater than 2, so that's why you aren't getting that result back.

如果您希望能够在2跳中到达AT到GT,则需要在TG和GT之间添加一条边,如下所示:

If you are expecting to be able to reach AT to GT in 2 hops, then you need to add an edge between TG and GT, like this:

INSERT INTO nodes VALUES('TG','GT');

现在,当您运行查询时,您将获得以下数据:

Now when you run your query, you'll get this data back:

NODE_FROM NODE_TO TG TG气相色谱 TG GG TG GT

NODE_FROM NODE_TO AT TG TG GC TG GG TG GT

请记住,START WITH/CONNECT BY仅在节点之间存在路径时才起作用.在您的图中(在上面添加新边之前),没有AT->TG->GT的路径,所以这就是为什么您没有将结果取回的原因.

Remember that START WITH/CONNECT BY is going to only work if there is a path between the nodes. In your graph (before I added the new edge above), there is no path for AT->TG->GT, so that's why you're not getting the result back.

现在,如果添加了边缘TG->AT,那么我们将获得路径GT->TG->AT.因此,在这种情况下,AT与GT的距离为2跳(也就是说,我们现在要以相反的方式,从GT开始,到AT为止).但是要找到这些路径,您需要设置START WITH node_from ='GT'.

Now, if you added the edge TG->AT, then we would have the path GT->TG->AT. So in that case AT is 2 hops away from GT (i.e. we're going the reverse way now, starting from GT and ending at AT). But to find those paths, you would need to set START WITH node_from = 'GT'.

如果您的目标是查找从起始节点到任何目标节点的所有路径,且所有目标节点之间的距离为< = 2跳或更短,则上述方法应该有效.

If your goal is to find all paths from a start node to any target node that is level <= 2 hops or less away, then the above should work.

但是,如果您想全部找到从某个目标节点到源节点的所有路径(即,我从GT->TG->AT给出的相反示例),那么在这里是行不通的.您必须对图中的所有节点运行查询.

However, if you want to all find all paths from some target node back to a source node (i.e. the reverse example I gave, from GT->TG->AT), then that's not going to work here. You'd have to run the query for all nodes in the graph.

START WITH/CONNECT BY视为深度优先搜索.它将从起始节点到处都可以到达.但这不会做更多的事情.

Think of START WITH/CONNECT BY as doing a depth first search. It's going to go everywhere it can from a starting node. But it's not going to do any more than that.

摘要:

鉴于上述限制,我认为查询工作正常.我已经解释了为什么不返回GT-TG路径的原因,所以我希望这是有道理的.

I think the query works fine, given the constraints above. I've explained why the GT-TG path is not returned, so I hope that makes sense.

请记住,但是,如果您还尝试遍历反向路径,则必须遍历每个节点并运行查询,每次更改START WITH节点.

Keep in mind, however, if you are trying to traverse reverse paths as well, you'll have to loop over every node and run the query, changing the START WITH node each time.

这篇关于有向图SQL的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆