ArangoDB:获取与所选节点有任何关系的每个节点 [英] ArangoDB: Get every node, which is in any way related to a selected node

查看:39
本文介绍了ArangoDB:获取与所选节点有任何关系的每个节点的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在 ArangoDB 中有一个简单的节点链接图.如何从 1 个预选节点遍历并返回与其相关的所有节点?

例如:A→B、B→C、C→D、C→E、F→B、F→E

选择其中任何一个都应该返回相同的结果(所有这些).

我对 ArangoDB 很陌生.

解决方案

您需要的是

//在出站方向跟随边(链接"集合),从 A 开始FOR v IN OUTBOUND节点/A"链接//返回我们看到的每个顶点的键(节点名称)返回 v._key

这只会返回[ "B" ],因为遍历深度隐式为1..1 (min=1, max=1).如果我们增加最大深度,那么我们也可以包括间接连接的节点:

FOR v IN 1..10 出站节点/A"链接返回 v._key

这将给我们 [ "B", "C", "D", "E"].如果我们看一下图,这是正确的:我们只沿着从我们来自的顶点指向另一个顶点(箭头的方向)的边.反过来,我们可以使用 INBOUND,但在你的情况下,我们想忽略边缘的方向并继续跟随:

FOR v IN 1..10 任何节点/A"链接返回 v._key

刚开始的结果可能有点出人意料:
[ "B", "C", "D", "E", "F", "B", "F", "E", "C", "D", "B" ]

我们看到返回了重复的节点.原因是例如从 A 到 C 有多条路径(通过 B 和 B-F-E),并且查询返回每个路径的最后一个节点作为变量 v.(它实际上并不处理所有可能的路径,最大深度为 10,但您可以设置遍历选项 OPTIONS {uniqueEdges: "none"} 来执行此操作.)

它可以帮助返回格式化的遍历路径以更好地了解正在发生的事情(即如何到达节点):

FOR v, e, p IN 1..10 任何节点/A"链接选项 {uniqueEdges: "path"}RETURN CONCAT_SEPARATOR(" - ", p.vertices[*]._key)

结果:

<预><代码>["A-B","A - B - C",A B C D","A - B - C - E","A - B - C - E - F","A - B - C - E - F - B","A - B - F","A - B - F - E","A - B - F - E - C",A - B - F - E - C - D",A - B - F - E - C - B"]

图中存在循环,但不能无限循环,因为10跳后超过了最大深度.但是正如你在上面看到的,它甚至没有达到 10 的深度,而是停止,因为(默认)选项是不沿着每条路径两次边缘(uniqueEdges: "path").

无论如何,这不是想要的结果.一个便宜的技巧是使用 RETURN DISTINCTCOLLECT 或类似的东西来删除重复项.但是我们最好调整遍历选项,以免不必要地跟随边缘.

uniqueEdges: "global" 仍然会包含 B 节点两次,但 uniqueVertices: "global" 给出了想要的结果.此外,在这种情况下可以使用 bfs: true 进行广度优先搜索.不同之处在于到 F 节点的路径更短(A-B-F 而不是 A-B-C-E-F).一般来说,您应该使用的确切选项在很大程度上取决于数据集和您的问题.

还有一个问题需要解决:遍历不包括起始顶点(除了在 p.vertices[0] 中的每条路径).通过将最小深度设置为 0,使用 ArangoDB 3.0 或更高版本可以轻松解决此问题:

FOR v IN 0..10 任何节点/A"链接选项{uniqueVertices:global"}返回 v._key

[ "A", "B", "C", "D", "E", "F" ]

为了验证从 A 到 F 的所有节点都被返回,无论起始顶点如何,我们可以发出以下测试查询:

FOR doc IN 节点返回 (FOR v IN 0..10 ANY doc links OPTIONS {uniqueVertices: "global"}SORT v._key返回 v._key)

所有子数组应该看起来都一样.如果您希望以遍历顺序返回节点名称,请删除 SORT 操作.希望这会有所帮助 =)

I have a simple node-links graph in ArangoDB. How can I traverse from 1 preselected node and return all nodes which are related to it?

For example: A→B, B→C, C→D, C→E, F→B, F→E

Selecting any of them should return the same result (all of them).

I am very new to ArangoDB.

解决方案

What you need is AQL graph traversal, available since ArangoDB 2.8. Older versions provided a set of graph-related functions, but native AQL traversal is faster, more flexible and the graph functions are no longer available starting with 3.0.


AQL traversal let's you follow edges connected to a start vertex, up to a variable depth. Each encountered vertex can be accessed, e.g. for filtering or to construct a result, as well as the edge that led you to this vertex and the full path from start to finish including both, vertices and edges.

In your case, only the names of the visited vertices need to be returned. You can run the following AQL queries, assuming there's a document collection node and an edge collection links and they contain the data for this graph:

// follow edges ("links" collection) in outbound direction, starting at A
FOR v IN OUTBOUND "node/A" links
    // return the key (node name) for every vertex we see
    RETURN v._key

This will only return [ "B" ], because the traversal depth is implicitly 1..1 (min=1, max=1). If we increase the max depth, then we can include nodes that are indirectly connected as well:

FOR v IN 1..10 OUTBOUND "node/A" links
    RETURN v._key

This will give us [ "B", "C", "D", "E"]. If we look at the graph, this is correct: we only follow edges that point from the vertex we come from to another vertex (direction of the arrow). To do the reverse, we could use INBOUND, but in your case, we want to ignore the direction of the edge and follow anyway:

FOR v IN 1..10 ANY "node/A" links
    RETURN v._key

The result might be a bit surprising at first:
[ "B", "C", "D", "E", "F", "B", "F", "E", "C", "D", "B" ]

We see duplicate nodes returned. The reason is that there are multiple paths from A to C for instance (via B and also via B-F-E), and the query returns the last node of every path as variable v. (It doesn't actually process all possible paths up to the maximum depth of 10, but you could set the traversal option OPTIONS {uniqueEdges: "none"} to do so.)

It can help to return formatted traversal paths to better understand what is going on (i.e. how nodes are reached):

FOR v, e, p IN 1..10 ANY "node/A" links OPTIONS {uniqueEdges: "path"}
    RETURN CONCAT_SEPARATOR(" - ", p.vertices[*]._key)

Result:

[
  "A - B",
  "A - B - C",
  "A - B - C - D",
  "A - B - C - E",
  "A - B - C - E - F",
  "A - B - C - E - F - B",
  "A - B - F",
  "A - B - F - E",
  "A - B - F - E - C",
  "A - B - F - E - C - D",
  "A - B - F - E - C - B"
]

There is a cycle in the graph, but there can't be an infinite loop because the maximum depth is exceeded after 10 hops. But as you can see above, it doesn't even reach the depth of 10, it rather stops because the (default) option is to not follow edges twice per path (uniqueEdges: "path").

Anyway, this is not the desired result. A cheap trick would be to use RETURN DISTINCT, COLLECT or something like that to remove duplicates. But we are better off tweaking the traversal options, to not follow edges unnecessarily.

uniqueEdges: "global" would still include the B node twice, but uniqueVertices: "global" gives the desired result. In addition, bfs: true for breadth-first search can be used in this case. The difference is that the path to the F node is shorter (A-B-F instead of A-B-C-E-F). In general, the exact options you should use largely depend on the dataset and the questions you have.

There's one more problem to solve: the traversal does not include the start vertex (other than in p.vertices[0] for every path). This can easily be solved using ArangoDB 3.0 or later by setting the minimum depth to 0:

FOR v IN 0..10 ANY "node/A" links OPTIONS {uniqueVertices: "global"}
    RETURN v._key

[ "A", "B", "C", "D", "E", "F" ]

To verify that all nodes from A through F are returned, regardless of the start vertex, we can issue the following test query:

FOR doc IN node
    RETURN (
        FOR v IN 0..10 ANY doc links OPTIONS {uniqueVertices: "global"}
            SORT v._key
            RETURN v._key
    )

All sub-arrays should look the same. Remove the SORT operation if you want the node names returned in traversal order. Hope this helps =)

这篇关于ArangoDB:获取与所选节点有任何关系的每个节点的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆