图中具有1亿个节点的连接组件 [英] Connected components in a graph with 100 million nodes

查看：73 发布时间：2020/11/20 5:38:47 python graph

本文介绍了图中具有1亿个节点的连接组件的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在尝试在具有1亿个节点的图形中获取连接的组件的列表.对于较小的图，我通常使用 connected_components 函数正是这样做的.但是，使用此模块将具有1亿个节点(及其边缘)的图形加载到内存中将需要大约. 110GB的内存，我没有.一种替代方法是使用具有连接的组件功能的图形数据库，但我在Python中没有找到任何数据库.似乎Dex(API:Java，.NET，C ++)具有此功能，但我不确定100％.理想情况下，我正在寻找Python中的解决方案.非常感谢.

I am trying to get the list of connected components in a graph with 100 million nodes. For smaller graphs, I usually use the connected_components function of the Networkx module in Python which does exactly that. However, loading a graph with 100 million nodes (and their edges) into memory with this module would require ca. 110GB of memory, which I don't have. An alternative would be to use a graph database which has a connected components function but I haven't found any in Python. It would seem that Dex (API: Java, .NET, C++) has this functionality but I'm not 100% sure. Ideally I'm looking for a solution in Python. Many thanks.

推荐答案

SciPy具有稀疏矩阵格式之一来输入图的邻接矩阵.有针对性和无针对性的案例.

SciPy has a connected components algorithm. It expects as input the adjacency matrix of your graph in one of its sparse matrix formats and handles both the directed and undirected cases.

根据(i, j)对adj_list的序列构建稀疏邻接矩阵，其中i和j是节点(从零开始)的索引，可以通过以下方式完成

Building a sparse adjacency matrix from a sequence of (i, j) pairs adj_list where i and j are (zero-based) indices of nodes can be done with

i_indices, j_indices = zip(*adj_list)
adj_matrix = scipy.sparse.coo_matrix((np.ones(number_of_nodes),
                                     (i_indices, j_indices)))

对于非定向案例，您将不得不做一些额外的工作.

You'll have to do some extra work for the undirected case.

如果您的图形足够稀疏，则此方法应该有效.

This approach should be efficient if your graph is sparse enough.

这篇关于图中具有1亿个节点的连接组件的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

图中具有1亿个节点的连接组件 [英] Connected components in a graph with 100 million nodes

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

图中具有1亿个节点的连接组件 [英] Connected components in a graph with 100 million nodes

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭