一个快速的方法来找到连接的组件在1-NN图? [英] A fast way to find connected component in a 1-NN graph?

查看:132
本文介绍了一个快速的方法来找到连接的组件在1-NN图?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

首先,我得到了A N * N的距离矩阵,每一个点,我计算出其最近的邻居,所以我们有一个N * 2的矩阵,这似乎是的这个的:

  0  - > 1
1  - > 2
2  - > 3
3  - > 2
4  - > 2
5  - > 6
6  - > 7
7  - > 6
8  - > 6
9  - > 8
 

第二列是最近邻居的索引。因此,这是一种特殊的定向 图中,每个顶点曾和只有一个出度。

当然,我们可以先改造N * 2的矩阵,以一个标准的图形重新presentation,并执行BFS / DFS来获得所连接的部件。

不过,鉴于这种特殊图形的特点,是否有任何其他快速的方法来完成这项工作?

我将非常AP preciated。

更新:

我已经实现了一个简单的算法,这种的情况这里

你看,我没有用一个工会找到的算法,因为数据结构可以使事情不是那么容易的,我怀疑这是在我的情况下,以最快的方式(我是说几乎)。

您可能会认为,_merge过程可能是费时,但如果我们换边进入连续的地方,而赋予新的标签,合并可能会花费少,但需要另外N空间追查原始指数。

解决方案

最快的算法寻找连接的组件给定边列表中的工会找到的算法:对于每个节点,将指针向在相同组中的一个节点,与所有边缘会聚到相同的节点,如果发现长度为至少2的路径,向上重新连接底层节点。

这肯定会以线性时间运行:

   - 推所有边缘成工会发现结构:O(N)
 - 存储在集中的每个节点(联合-FIND根)
    并更新组非空集:为O(n)
 - 返回组非空集(图表组件)。
 

由于边缘的列表已经几乎形成一个联合-查找树中,也能够跳过第一步:

 为每个节点
 - 如果该节点没有被标记为收集
 - 步行沿边缘,直到你找到一个令-1或订单-2环,
   收集节点航路
 - 重新连接所有节点的路径的端部,并考虑其根为一组。
 - 存储所有节点的集合的根。
 - 更新组非空集。
 - 标记所有节点的收集。
返回该组非空集
 

第二种算法是线性的为好,但只是一个基准会告诉我们,如果它的实际速度更快。工会-FIND算法的优点在于它的优化。这延迟了优化的第二步骤,但完全消除第一步

您也许可以挤出一点性能,如果你加入工会的步骤与近邻计算,然后收集集在第二遍。

First of all, I got a N*N distance matrix, for each point, I calculated its nearest neighbor, so we had a N*2 matrix, It seems like this:

0 -> 1  
1 -> 2  
2 -> 3  
3 -> 2  
4 -> 2  
5 -> 6  
6 -> 7  
7 -> 6  
8 -> 6  
9 -> 8

the second column was the nearest neighbor's index. So this was a special kind of directed graph, with each vertex had and only had one out-degree.

Of course, we could first transform the N*2 matrix to a standard graph representation, and perform BFS/DFS to get the connected components.

But, given the characteristic of this special graph, is there any other fast way to do the job ?

I will be really appreciated.

Update:

I've implemented a simple algorithm for this case here.

Look, I did not use a union-find algorithm, because the data structure may make things not that easy, and I doubt whether It's the fastest way in my case(I meant practically).

You could argue that the _merge process could be time consuming, but if we swap the edges into the continuous place while assigning new label, the merging may cost little, but it need another N spaces to trace the original indices.

解决方案

The fastest algorithm for finding connected components given an edge list is the union-find algorithm: for each node, hold the pointer to a node in the same set, with all edges converging to the same node, if you find a path of length at least 2, reconnect the bottom node upwards.

This will definitely run in linear time:

- push all edges into a union-find structure: O(n)
- store each node in its set (the union-find root)
    and update the set of non-empty sets: O(n)
- return the set of non-empty sets (graph components).

Since the list of edges already almost forms a union-find tree, it is possible to skip the first step:

for each node
- if the node is not marked as collected
-- walk along the edges until you find an order-1 or order-2 loop, 
   collecting nodes en-route
-- reconnect all nodes to the end of the path and consider it a root for the set.
-- store all nodes in the set for the root.
-- update the set of non-empty sets.
-- mark all nodes as collected.
return the set of non-empty sets

The second algorithm is linear as well, but only a benchmark will tell if it's actually faster. The strength of the union-find algorithm is its optimization. This delays the optimization to the second step but removes the first step completely.

You can probably squeeze out a little more performance if you join the union step with the nearest neighbor calculation, then collect the sets in the second pass.

这篇关于一个快速的方法来找到连接的组件在1-NN图?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆