在大图中有效地找到最短路径 [英] Efficiently finding the shortest path in large graphs

查看:39
本文介绍了在大图中有效地找到最短路径的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种方法来实时找到巨大图中节点之间的最短路径.它有数十万个顶点和数百万条边.我知道以前有人问过这个问题,我想答案是使用广度优先搜索,但我更想知道您可以使用什么软件来实现它.例如,如果它已经存在一个用于在无向图中执行 bfs 的库(带有 python 绑定!),那将是完全完美的.

I'm looking to find a way to in real-time find the shortest path between nodes in a huge graph. It has hundreds of thousands of vertices and millions of edges. I know this question has been asked before and I guess the answer is to use a breadth-first search, but I'm more interested in to know what software you can use to implement it. For example, it would be totally perfect if it already exist a library (with python bindings!) for performing bfs in undirected graphs.

推荐答案

python-graph

添加:

评论让我很好奇pygraph的性能对于OP的顺序有什么问题,所以我做了一个玩具程序来找出答案.以下是该问题的较小版本的输出:

The comments made me curious as to how the performance of pygraph was for a problem on the order of the OP, so I made a toy program to find out. Here's the output for a slightly smaller version of the problem:

$ python2.6 biggraph.py 4 6
biggraph generate 10000 nodes     00:00:00
biggraph generate 1000000 edges   00:00:00
biggraph add edges                00:00:05
biggraph Dijkstra                 00:01:32
biggraph shortest_path done       00:04:15
step: 1915 2
step: 0 1
biggraph walk done                00:04:15
path: [9999, 1915, 0]

对于 10k 节点和 1M 边来说还不错.重要的是要注意,pygraph 计算 Dijkstra 的方式为每个节点相对于一个目标(任意节点 0,并且在图中没有特权位置)生成所有生成树的字典.因此,计算耗时 3.75 分钟的解决方案实际上给出了从所有节点到目标的最短路径是什么?"的答案.事实上,一旦 shortest_path 完成,走答案只是字典查找,基本上没有时间.还值得注意的是,将预先计算的边添加到图中的成本在大约 1.5 分钟时相当昂贵.这些时间在多次运行中是一致的.

Not too bad for 10k nodes and 1M edges. It is important to note that the way Dijkstra's is computed by pygraph yields a dictionary of all spanning trees for each node relative to one target (which was arbitrarily node 0, and holds no privileged position in the graph). Therefore, the solution that took 3.75 minutes to compute actually yielded the answer to "what is the shortest path from all nodes to the target?". Indeed once shortest_path was done, walking the answer was mere dictionary lookups and took essentially no time. It is also worth noting that adding the pre-computed edges to the graph was rather expensive at ~1.5 minutes. These timings are consistent across multiple runs.

我想说这个过程可以很好地扩展,但我仍在等待 biggraph 5 6 在其他闲置的计算机上(Athlon 64, 4800 BogoMIPS 每个处理器,全部在核心中)已经运行了超过四分之一小时.至少内存使用稳定在0.5GB左右.结果如下:

I'd like to say that the process scales well, but I'm still waiting on biggraph 5 6 on an otherwise idled computer (Athlon 64, 4800 BogoMIPS per processor, all in core) which has been running for over a quarter hour. At least the memory use is stable at about 0.5GB. And the results are in:

biggraph generate 100000 nodes    00:00:00
biggraph generate 1000000 edges   00:00:00
biggraph add edges                00:00:07
biggraph Dijkstra                 00:01:27
biggraph shortest_path done       00:23:44
step: 48437 4
step: 66200 3
step: 83824 2
step: 0 1
biggraph walk done                00:23:44
path: [99999, 48437, 66200, 83824, 0]

那是一段很长的时间,但这也是一个繁重的计算(我真的希望我能对结果进行腌制).这是好奇的代码:

That's a long time, but it was also a heavy computation (and I really wish I'd pickled the result). Here's the code for the curious:

#!/usr/bin/python

import pygraph.classes.graph
import pygraph.algorithms
import pygraph.algorithms.minmax
import time
import random
import sys

if len(sys.argv) != 3:
    print ('usage %s: node_exponent edge_exponent' % sys.argv[0])
    sys.exit(1)

nnodes = 10**int(sys.argv[1])
nedges = 10**int(sys.argv[2])

start_time = time.clock()
def timestamp(s):
    t = time.gmtime(time.clock() - start_time)
    print 'biggraph', s.ljust(24), time.strftime('%H:%M:%S', t)

timestamp('generate %d nodes' % nnodes)
bg = pygraph.classes.graph.graph()
bg.add_nodes(xrange(nnodes))

timestamp('generate %d edges' % nedges)
edges = set()
while len(edges) < nedges:
    left, right = random.randrange(nnodes), random.randrange(nnodes)
    if left == right:
        continue
    elif left > right:
        left, right = right, left
    edges.add((left, right))

timestamp('add edges')
for edge in edges:
    bg.add_edge(edge)

timestamp("Dijkstra")
target = 0
span, dist = pygraph.algorithms.minmax.shortest_path(bg, target)
timestamp('shortest_path done')

# the paths from any node to target is in dict span, let's
# pick any arbitrary node (the last one) and walk to the
# target from there, the associated distance will decrease
# monotonically
lastnode = nnodes - 1
path = []
while lastnode != target:
    nextnode = span[lastnode]
    print 'step:', nextnode, dist[lastnode]
    assert nextnode in bg.neighbors(lastnode)
    path.append(lastnode)
    lastnode = nextnode
path.append(target)
timestamp('walk done')
print 'path:', path

这篇关于在大图中有效地找到最短路径的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆