高效地查找大图中的最短路径 [英] Efficiently finding the shortest path in large graphs

查看:192
本文介绍了高效地查找大图中的最短路径的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种方法来实时查找巨大图形中节点之间的最短路径。它有数十万个顶点和数百万条边。我知道这个问题之前已经被问过了,我想答案是使用广度优先搜索,但我更感兴趣的是知道您可以使用哪些软件来实现它。例如,如果它已经存在一个用于在无向图中执行bfs的库(带有python绑定!),那将是完全不错的。

p> python-graph



添加:



这些评论让我很好奇,pygraph的性能如何影响OP的顺序,所以我制作了一个玩具程序来查明。这是问题的一个稍微小版本的输出:

pre $ $ $ c $ $ $ python2.6 biggraph.py 4 6
biggraph生成10000个节点00:00:00
biggraph生成1000000个边缘00:00:00
biggraph增加边缘00:00:05
biggraph Dijkstra 00:01:32
biggraph shortest_path完成00:04:15
step:1915 2
step:0 1
biggraph walk done 00:04:15
path:[9999,1915,0]

对于10k个节点和1M个边缘不会太糟糕。重要的是要注意,Dijkstra的方式是通过pygraph计算的,为每个节点相对于一个目标生成所有生成树的字典(该节点是任意节点0,并且在图中不具有特权位置)。因此,花费3.75分钟计算的解决方案实际上得出了从所有节点到目标的最短路径是多少的答案。事实上,一旦完成了 shortest_path ,行走答案仅仅是字典查找,并且几乎没有时间。还值得注意的是,在约1.5分钟处将预先计算的边缘添加到图形上是相当昂贵的。这些时间在多次运行中是一致的。



我想说这个过程很好,但我仍然在等待 biggraph 5在一台空闲的计算机上运行()(Athlon 64,4800每个处理器 BogoMIPS ,所有核心)已经运行了一个多小时。至少内存使用稳定在0.5GB左右。结果如下:

  biggraph生成100000个节点00:00:00 
biggraph生成1000000个边00:00 :00
biggraph加边00:00:07
biggraph Dijkstra 00:01:27
biggraph shortest_path done 00:23:44
step:48437 4
step :66200 3
step:83824 2
step:0 1
biggraph walk done 00:23:44
path:[99999,48437,66200,83824,0]

这段时间很长,但它也是一个沉重的计算(我真的希望我会腌制结果)。这是好奇的代码:

 #!/ usr / bin / python 

import pygraph。 classes.graph
import pygraph.algorithms
import pygraph.algorithms.minmax
import time
import random
import sys

if len( sys.argv)!= 3:
print('usage%s:node_exponent edge_exponent'%sys.argv [0])
sys.exit(1)

nnodes = 10 ** int(sys.argv [1])$ ​​b $ b nedges = 10 ** int(sys.argv [2])

start_time = time.clock()
def timestamp(s):
t = time.gmtime(time.clock() - start_time)
print'biggraph',s.ljust(24),time.strftime('%H:%M:% S',t)

timestamp('generate%d nodes'%nnodes)
bg = pygraph.classes.graph.graph()
bg.add_nodes(xrange(nnodes ))

timestamp('generate%d edges'%nedges)
edges = set()
while len(edges)< nedges:
left,right = random.randrange(nnodes),random.randrange(nnodes)
if left == right:
continue
elif left>右:
左边,右边=右边,左边
edges.add((左边,右边))

时间戳('添加边缘')
边缘边缘:
bg.add_edge(edge)

timestamp(Dijkstra)
target = 0
span,dist = pygraph.algorithms.minmax.shortest_path(bg,目标)
timestamp('shortest_path done')

#从任何节点到目标的路径都在字典范围内,让我们
#选择任意节点(最后一个)和从那里步行到
#目标,相关距离将单调减少

lastnode = nnodes - 1
path = []
while lastnode!= target:
nextnode = span [lastnode]
print'step:',nextnode,dist [lastnode]
assert nextnode in bg.neighbors(lastnode)
path.append(lastnode)
lastnode = nextnode
path.append(target)
timestamp('walk done')
print'path:',path


I'm looking to find a way to in real-time find the shortest path between nodes in a huge graph. It has hundreds of thousands of vertices and millions of edges. I know this question has been asked before and I guess the answer is to use a breadth-first search, but I'm more interested in to know what software you can use to implement it. For example, it would be totally perfect if it already exist a library (with python bindings!) for performing bfs in undirected graphs.

解决方案

python-graph

added:

The comments made me curious as to how the performance of pygraph was for a problem on the order of the OP, so I made a toy program to find out. Here's the output for a slightly smaller version of the problem:

$ python2.6 biggraph.py 4 6
biggraph generate 10000 nodes     00:00:00
biggraph generate 1000000 edges   00:00:00
biggraph add edges                00:00:05
biggraph Dijkstra                 00:01:32
biggraph shortest_path done       00:04:15
step: 1915 2
step: 0 1
biggraph walk done                00:04:15
path: [9999, 1915, 0]

Not too bad for 10k nodes and 1M edges. It is important to note that the way Dijkstra's is computed by pygraph yields a dictionary of all spanning trees for each node relative to one target (which was arbitrarily node 0, and holds no privileged position in the graph). Therefore, the solution that took 3.75 minutes to compute actually yielded the answer to "what is the shortest path from all nodes to the target?". Indeed once shortest_path was done, walking the answer was mere dictionary lookups and took essentially no time. It is also worth noting that adding the pre-computed edges to the graph was rather expensive at ~1.5 minutes. These timings are consistent across multiple runs.

I'd like to say that the process scales well, but I'm still waiting on biggraph 5 6 on an otherwise idled computer (Athlon 64, 4800 BogoMIPS per processor, all in core) which has been running for over a quarter hour. At least the memory use is stable at about 0.5GB. And the results are in:

biggraph generate 100000 nodes    00:00:00
biggraph generate 1000000 edges   00:00:00
biggraph add edges                00:00:07
biggraph Dijkstra                 00:01:27
biggraph shortest_path done       00:23:44
step: 48437 4
step: 66200 3
step: 83824 2
step: 0 1
biggraph walk done                00:23:44
path: [99999, 48437, 66200, 83824, 0]

That's a long time, but it was also a heavy computation (and I really wish I'd pickled the result). Here's the code for the curious:

#!/usr/bin/python

import pygraph.classes.graph
import pygraph.algorithms
import pygraph.algorithms.minmax
import time
import random
import sys

if len(sys.argv) != 3:
    print ('usage %s: node_exponent edge_exponent' % sys.argv[0])
    sys.exit(1)

nnodes = 10**int(sys.argv[1])
nedges = 10**int(sys.argv[2])

start_time = time.clock()
def timestamp(s):
    t = time.gmtime(time.clock() - start_time)
    print 'biggraph', s.ljust(24), time.strftime('%H:%M:%S', t)

timestamp('generate %d nodes' % nnodes)
bg = pygraph.classes.graph.graph()
bg.add_nodes(xrange(nnodes))

timestamp('generate %d edges' % nedges)
edges = set()
while len(edges) < nedges:
    left, right = random.randrange(nnodes), random.randrange(nnodes)
    if left == right:
        continue
    elif left > right:
        left, right = right, left
    edges.add((left, right))

timestamp('add edges')
for edge in edges:
    bg.add_edge(edge)

timestamp("Dijkstra")
target = 0
span, dist = pygraph.algorithms.minmax.shortest_path(bg, target)
timestamp('shortest_path done')

# the paths from any node to target is in dict span, let's
# pick any arbitrary node (the last one) and walk to the
# target from there, the associated distance will decrease
# monotonically
lastnode = nnodes - 1
path = []
while lastnode != target:
    nextnode = span[lastnode]
    print 'step:', nextnode, dist[lastnode]
    assert nextnode in bg.neighbors(lastnode)
    path.append(lastnode)
    lastnode = nextnode
path.append(target)
timestamp('walk done')
print 'path:', path

这篇关于高效地查找大图中的最短路径的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆