Python中最有效的图形数据结构是什么？ [英] What is the most efficient graph data structure in Python?

查看：249 发布时间：2017/4/3 11:06:46 python performance data-structures graph-theory

本文介绍了Python中最有效的图形数据结构是什么？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我需要能够在python中操作一个大的（10 ^ 7个节点）图形。对应于每个节点/边缘的数据是最小的，例如，少量的字符串。在这种方式下，最有效率的是什么（ 内存和速度 ）？

dict的命令更灵活，更简单实现，但我直观地期望列表列表更快。列表选项还要求我将数据与结构保持分开，而dicts将允许使用以下类型：

 图[I] [J] [Property] =value

你会建议什么？

是的，我应该比较清楚我的意思是效率。在这种特殊情况下，我的意思是随机访问检索。

将数据加载到内存不是一个巨大的问题。这完成了一切。耗时的部分是访问节点，所以我可以提取信息并测量我感兴趣的指标。

我没有考虑使每个节点成为一个类（属性对于所有节点都是相同的），但是它似乎会增加额外的开销？我希望有一个人能够与他们可以分享的类似案例有一些直接的经验。毕竟，图形是CS中最常见的抽象之一。

解决方案

我强烈主张你看看

I need to be able to manipulate a large (10^7 nodes) graph in python. The data corresponding to each node/edge is minimal, say, a small number of strings. What is the most efficient, in terms of memory and speed, way of doing this?

A dict of dicts is more flexible and simpler to implement, but I intuitively expect a list of lists to be faster. The list option would also require that I keep the data separate from the structure, while dicts would allow for something of the sort:

graph[I][J]["Property"]="value"

What would you suggest?

Yes, I should have been a bit clearer on what I mean by efficiency. In this particular case I mean it in terms of random access retrieval.

Loading the data in to memory isn't a huge problem. That's done once and for all. The time consuming part is visiting the nodes so I can extract the information and measure the metrics I'm interested in.

I hadn't considered making each node a class (properties are the same for all nodes) but it seems like that would add an extra layer of overhead? I was hoping someone would have some direct experience with a similar case that they could share. After all, graphs are one of the most common abstractions in CS.

解决方案

I would strongly advocate you look at NetworkX. It's a battle-tested war horse and the first tool most 'research' types reach for when they need to do analysis of network based data. I have manipulated graphs with 100s of thousands of edges without problem on a notebook. Its feature rich and very easy to use. You will find yourself focusing more on the problem at hand rather than the details in the underlying implementation.

Example of Erdős-Rényi random graph generation and analysis


"""
Create an G{n,m} random graph with n nodes and m edges
and report some properties.

This graph is sometimes called the Erd##[m~Qs-Rényi graph
but is different from G{n,p} or binomial_graph which is also
sometimes called the Erd##[m~Qs-Rényi graph.
"""
__author__ = """Aric Hagberg (hagberg@lanl.gov)"""
__credits__ = """"""
#    Copyright (C) 2004-2006 by 
#    Aric Hagberg 
#    Dan Schult 
#    Pieter Swart 
#    Distributed under the terms of the GNU Lesser General Public License
#    http://www.gnu.org/copyleft/lesser.html

from networkx import *
import sys

n=10 # 10 nodes
m=20 # 20 edges

G=gnm_random_graph(n,m)

# some properties
print "node degree clustering"
for v in nodes(G):
    print v,degree(G,v),clustering(G,v)

# print the adjacency list to terminal 
write_adjlist(G,sys.stdout)

Visualizations are also straightforward:

More visualization: http://jonschull.blogspot.com/2008/08/graph-visualization.html

这篇关于Python中最有效的图形数据结构是什么？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

Python中最有效的图形数据结构是什么？ [英] What is the most efficient graph data structure in Python?

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Python中最有效的图形数据结构是什么？ [英] What is the most efficient graph data structure in Python?

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭