A* Algorithm for very large graphs, any thoughts on caching shortcuts?

Question

I'm writing a courier/logistics simulation on OpenStreetMap maps and have realised that the basic A* algorithm as pictured below is not going to be fast enough for large maps (like Greater London).

The green nodes correspond to ones that were put in the open set/priority queue, and due to the huge number (the whole map has something like 1-2 million nodes), it takes 5 seconds or so to find the route pictured. Unfortunately 100 ms per route is about my absolute limit.

Currently, the nodes are stored both in an adjacency list and in a spatial 100x100 2D array.
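For reference, the dual storage described above might look roughly like the sketch below. The class and names (`Graph`, `GRID_SIZE`, the bounding-box constructor) are illustrative assumptions, not taken from the actual implementation:

```python
from collections import defaultdict

GRID_SIZE = 100  # 100x100 spatial buckets, as in the question

class Graph:
    """Nodes kept in both an adjacency list and a spatial 2D grid."""

    def __init__(self, min_lat, max_lat, min_lon, max_lon):
        self.adj = defaultdict(list)   # node_id -> [(neighbour_id, edge_cost), ...]
        self.coords = {}               # node_id -> (lat, lon)
        self.grid = [[[] for _ in range(GRID_SIZE)] for _ in range(GRID_SIZE)]
        self.bounds = (min_lat, max_lat, min_lon, max_lon)

    def _cell(self, lat, lon):
        """Map a coordinate to its (row, col) bucket in the grid."""
        min_lat, max_lat, min_lon, max_lon = self.bounds
        row = min(int((lat - min_lat) / (max_lat - min_lat) * GRID_SIZE), GRID_SIZE - 1)
        col = min(int((lon - min_lon) / (max_lon - min_lon) * GRID_SIZE), GRID_SIZE - 1)
        return row, col

    def add_node(self, node_id, lat, lon):
        self.coords[node_id] = (lat, lon)
        row, col = self._cell(lat, lon)
        self.grid[row][col].append(node_id)

    def add_edge(self, a, b, cost):
        # Undirected road segment
        self.adj[a].append((b, cost))
        self.adj[b].append((a, cost))
```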

I'm looking for methods where I can trade off preprocessing time, space and if needed optimality of the route, for faster queries. The straight-line Haversine formula for the heuristic cost is the most expensive function according to the profiler - I have optimised my basic A* as much as I can.
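Since the profiler points at Haversine, one common trick (not from the original post, but a standard optimisation) is to swap it for the equirectangular approximation, which needs only one `cos` per call instead of the `sin`/`asin`/`sqrt`-heavy great-circle formula, and is accurate to well under 1% at city scale:

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine(lat1, lon1, lat2, lon2):
    """Exact great-circle distance -- the expensive version."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def equirectangular(lat1, lon1, lat2, lon2):
    """Flat-projection approximation; much cheaper, fine for a heuristic
    at city scale where the error is negligible."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)
```

Since A* only needs the heuristic to rank nodes, a slightly approximate distance changes almost nothing about the search while cutting the cost of the hottest function.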

For example, I was thinking that if I chose an arbitrary node X from each quadrant of the 2D array and ran A* between each pair, I could store the routes to disk for subsequent simulations. When querying, I would only need to run A* within the quadrants, to connect the endpoints to the precomputed route via X.

Is there a more refined version of what I've described above, or perhaps a different method I should pursue? Many thanks!

For the record, here are some benchmark results for arbitrarily weighting the heuristic cost and computing the path between 10 pairs of randomly picked nodes:

Weight  AvgDist%     Time (ms)
1       1            1461.2
1.05    1            1327.2
1.1     1            900.7
1.2     1.019658848  196.4
1.3     1.027619169  53.6
1.4     1.044714394  33.6
1.5     1.063963413  25.5
1.6     1.071694171  24.1
1.7     1.084093229  24.3
1.8     1.092208509  22
1.9     1.109188175  22.5
2       1.122856792  18.2
2.2     1.131574742  16.9
2.4     1.139104895  15.4
2.6     1.140021962  16
2.8     1.14088128   15.5
3       1.156303676  16
4       1.20256964   13
5       1.19610861   12.9

Surprisingly, increasing the coefficient to 1.1 almost halved the execution time whilst keeping the same route.

Answer

You should be able to make it much faster by trading off optimality. See Admissibility and optimality on Wikipedia.

The idea is to use an epsilon value which will lead to a solution no worse than 1 + epsilon times the optimal path, but which will cause fewer nodes to be considered by the algorithm. Note that this does not mean that the returned solution will always be 1 + epsilon times the optimal path. This is just the worst case. I don't know exactly how it would behave in practice for your problem, but I think it is worth exploring.
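This idea is usually called weighted A*: multiply the heuristic by `1 + epsilon` when computing a node's priority. A minimal sketch, assuming a hypothetical adjacency dict and heuristic function (names are mine, not from the answer):

```python
import heapq
import math

def weighted_astar(adj, h, start, goal, epsilon=0.1):
    """Weighted A*: priority f = g + (1 + epsilon) * h.
    Returns (cost, path); cost is at most (1 + epsilon) times optimal.
    adj: node -> list of (neighbour, edge_cost)
    h:   node -> heuristic estimate of distance to goal
    """
    w = 1.0 + epsilon
    g = {start: 0.0}
    parent = {start: None}
    open_heap = [(w * h(start), start)]  # array-backed binary heap
    closed = set()
    while open_heap:
        _, u = heapq.heappop(open_heap)
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return g[goal], path[::-1]
        if u in closed:
            continue  # stale duplicate entry
        closed.add(u)
        for v, cost in adj.get(u, ()):
            ng = g[u] + cost
            if v not in g or ng < g[v]:
                g[v] = ng
                parent[v] = u
                heapq.heappush(open_heap, (ng + w * h(v), v))
    return math.inf, []
```

With `epsilon = 0` this is plain A*; larger values make the search greedier, expanding fewer nodes at the cost of the bounded suboptimality described above (which matches the benchmark table in the question).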

Wikipedia lists a number of algorithms that rely on this idea. I believe this is your best bet to improve the algorithm, and it has the potential to run within your time limit while still returning good paths.

Since your algorithm deals with millions of nodes in 5 seconds, I assume you also use binary heaps for the implementation, correct? If you implemented them yourself, make sure they are implemented as simple arrays and that they are binary heaps.
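For illustration, an array-backed binary min-heap of the kind recommended here fits in a few lines (in Python one would normally just use the built-in `heapq` module, which is exactly such a flat-array heap):

```python
class BinaryHeap:
    """Minimal array-backed binary min-heap.
    Children of index i live at 2i+1 and 2i+2; no pointers, no node objects."""

    def __init__(self):
        self.a = []

    def push(self, item):
        # Append at the end, then sift up until the heap property holds.
        self.a.append(item)
        i = len(self.a) - 1
        while i > 0:
            p = (i - 1) // 2
            if self.a[p] <= self.a[i]:
                break
            self.a[p], self.a[i] = self.a[i], self.a[p]
            i = p

    def pop(self):
        # Remove the root, move the last element up, then sift down.
        a = self.a
        top = a[0]
        last = a.pop()
        if a:
            a[0] = last
            i, n = 0, len(a)
            while True:
                l, r = 2 * i + 1, 2 * i + 2
                smallest = i
                if l < n and a[l] < a[smallest]:
                    smallest = l
                if r < n and a[r] < a[smallest]:
                    smallest = r
                if smallest == i:
                    break
                a[i], a[smallest] = a[smallest], a[i]
                i = smallest
        return top
```

The point of the flat array is cache locality: parent/child hops are index arithmetic on one contiguous block of memory, which matters a lot when the open set holds hundreds of thousands of entries.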
