高效地查找大图中的最短路径 [英] Efficiently finding the shortest path in large graphs
问题描述
添加:
这些评论让我很好奇,pygraph的性能如何影响OP的顺序,所以我制作了一个玩具程序来查明。这是问题的一个稍微小版本的输出:
pre $ $ $ c $ $ $ python2.6 biggraph.py 4 6
biggraph生成10000个节点00:00:00
biggraph生成1000000个边缘00:00:00
biggraph增加边缘00:00:05
biggraph Dijkstra 00:01:32
biggraph shortest_path完成00:04:15
step:1915 2
step:0 1
biggraph walk done 00:04:15
path:[9999,1915,0]
对于10k个节点和1M个边缘不会太糟糕。重要的是要注意,Dijkstra的方式是通过pygraph计算的,为每个节点相对于一个目标生成所有生成树的字典(该节点是任意节点0,并且在图中不具有特权位置)。因此,花费3.75分钟计算的解决方案实际上得出了从所有节点到目标的最短路径是多少的答案。事实上,一旦完成了 shortest_path
,行走答案仅仅是字典查找,并且几乎没有时间。还值得注意的是,在约1.5分钟处将预先计算的边缘添加到图形上是相当昂贵的。这些时间在多次运行中是一致的。
我想说这个过程很好,但我仍然在等待 biggraph 5在一台空闲的计算机上运行(
)(Athlon 64,4800每个处理器 BogoMIPS ,所有核心)已经运行了一个多小时。至少内存使用稳定在0.5GB左右。结果如下:
biggraph生成100000个节点00:00:00
biggraph生成1000000个边00:00 :00
biggraph加边00:00:07
biggraph Dijkstra 00:01:27
biggraph shortest_path done 00:23:44
step:48437 4
step :66200 3
step:83824 2
step:0 1
biggraph walk done 00:23:44
path:[99999,48437,66200,83824,0]
这段时间很长,但它也是一个沉重的计算(我真的希望我会腌制结果)。这是好奇的代码:
#!/ usr / bin / python
import pygraph。 classes.graph
import pygraph.algorithms
import pygraph.algorithms.minmax
import time
import random
import sys
if len( sys.argv)!= 3:
print('usage%s:node_exponent edge_exponent'%sys.argv [0])
sys.exit(1)
nnodes = 10 ** int(sys.argv [1])$ b $ b nedges = 10 ** int(sys.argv [2])
start_time = time.clock()
def timestamp(s):
t = time.gmtime(time.clock() - start_time)
print'biggraph',s.ljust(24),time.strftime('%H:%M:% S',t)
timestamp('generate%d nodes'%nnodes)
bg = pygraph.classes.graph.graph()
bg.add_nodes(xrange(nnodes ))
timestamp('generate%d edges'%nedges)
edges = set()
while len(edges)< nedges:
left,right = random.randrange(nnodes),random.randrange(nnodes)
if left == right:
continue
elif left>右:
左边,右边=右边,左边
edges.add((左边,右边))
时间戳('添加边缘')
边缘边缘:
bg.add_edge(edge)
timestamp(Dijkstra)
target = 0
span,dist = pygraph.algorithms.minmax.shortest_path(bg,目标)
timestamp('shortest_path done')
#从任何节点到目标的路径都在字典范围内,让我们
#选择任意节点(最后一个)和从那里步行到
#目标,相关距离将单调减少
#
lastnode = nnodes - 1
path = []
while lastnode!= target:
nextnode = span [lastnode]
print'step:',nextnode,dist [lastnode]
assert nextnode in bg.neighbors(lastnode)
path.append(lastnode)
lastnode = nextnode
path.append(target)
timestamp('walk done')
print'path:',path
I'm looking to find a way to in real-time find the shortest path between nodes in a huge graph. It has hundreds of thousands of vertices and millions of edges. I know this question has been asked before and I guess the answer is to use a breadth-first search, but I'm more interested in to know what software you can use to implement it. For example, it would be totally perfect if it already exist a library (with python bindings!) for performing bfs in undirected graphs.
added:
The comments made me curious as to how the performance of pygraph was for a problem on the order of the OP, so I made a toy program to find out. Here's the output for a slightly smaller version of the problem:
$ python2.6 biggraph.py 4 6
biggraph generate 10000 nodes 00:00:00
biggraph generate 1000000 edges 00:00:00
biggraph add edges 00:00:05
biggraph Dijkstra 00:01:32
biggraph shortest_path done 00:04:15
step: 1915 2
step: 0 1
biggraph walk done 00:04:15
path: [9999, 1915, 0]
Not too bad for 10k nodes and 1M edges. It is important to note that the way Dijkstra's is computed by pygraph yields a dictionary of all spanning trees for each node relative to one target (which was arbitrarily node 0, and holds no privileged position in the graph). Therefore, the solution that took 3.75 minutes to compute actually yielded the answer to "what is the shortest path from all nodes to the target?". Indeed once shortest_path
was done, walking the answer was mere dictionary lookups and took essentially no time. It is also worth noting that adding the pre-computed edges to the graph was rather expensive at ~1.5 minutes. These timings are consistent across multiple runs.
I'd like to say that the process scales well, but I'm still waiting on biggraph 5 6
on an otherwise idled computer (Athlon 64, 4800 BogoMIPS per processor, all in core) which has been running for over a quarter hour. At least the memory use is stable at about 0.5GB. And the results are in:
biggraph generate 100000 nodes 00:00:00
biggraph generate 1000000 edges 00:00:00
biggraph add edges 00:00:07
biggraph Dijkstra 00:01:27
biggraph shortest_path done 00:23:44
step: 48437 4
step: 66200 3
step: 83824 2
step: 0 1
biggraph walk done 00:23:44
path: [99999, 48437, 66200, 83824, 0]
That's a long time, but it was also a heavy computation (and I really wish I'd pickled the result). Here's the code for the curious:
#!/usr/bin/python
import pygraph.classes.graph
import pygraph.algorithms
import pygraph.algorithms.minmax
import time
import random
import sys
if len(sys.argv) != 3:
print ('usage %s: node_exponent edge_exponent' % sys.argv[0])
sys.exit(1)
nnodes = 10**int(sys.argv[1])
nedges = 10**int(sys.argv[2])
start_time = time.clock()
def timestamp(s):
t = time.gmtime(time.clock() - start_time)
print 'biggraph', s.ljust(24), time.strftime('%H:%M:%S', t)
timestamp('generate %d nodes' % nnodes)
bg = pygraph.classes.graph.graph()
bg.add_nodes(xrange(nnodes))
timestamp('generate %d edges' % nedges)
edges = set()
while len(edges) < nedges:
left, right = random.randrange(nnodes), random.randrange(nnodes)
if left == right:
continue
elif left > right:
left, right = right, left
edges.add((left, right))
timestamp('add edges')
for edge in edges:
bg.add_edge(edge)
timestamp("Dijkstra")
target = 0
span, dist = pygraph.algorithms.minmax.shortest_path(bg, target)
timestamp('shortest_path done')
# the paths from any node to target is in dict span, let's
# pick any arbitrary node (the last one) and walk to the
# target from there, the associated distance will decrease
# monotonically
lastnode = nnodes - 1
path = []
while lastnode != target:
nextnode = span[lastnode]
print 'step:', nextnode, dist[lastnode]
assert nextnode in bg.neighbors(lastnode)
path.append(lastnode)
lastnode = nextnode
path.append(target)
timestamp('walk done')
print 'path:', path
这篇关于高效地查找大图中的最短路径的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!