Hadoop MapReduce实现图中最短的PATH,而不仅仅是距离 [英] Hadoop MapReduce implementation of shortest PATH in a graph, not just the distance

查看:109
本文介绍了Hadoop MapReduce实现图中最短的PATH,而不仅仅是距离的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在寻找MapReduce实现最短路径搜索算法。然而,我可以找到的所有实例计算从节点x到y的最短距离,并且没有实际输出实际最短路径,如xabcy / strong>。



至于我试图实现的是,我有几百个节点的图形,我需要对最短路径执行频繁模式分析在各个节点之间。这是我正在研究的一个研究项目。



如果有人能指出某些实现(如果存在)或给出一些指示至于如何破解现有的SSSP实现来生成路径以及距离。

解决方案

基本上这些实现与某种消息类型一起工作。因此消息会在地图和缩小阶段之间发送到HDFS。



在缩减器中,它们按距离进行分组和过滤,最小距离获胜。在这种情况下更新距离时,必须设置消息来源的顶点(可能是某个ID)。

因此,每个顶点都有额外的空间需求,但是您可以重构图中每条可能的最短路径。



根据您的评论:

可能

< blockquote>

我需要编写另一个顶点对象类来存放这个
的附加信息。感谢您的提示,但如果您能指出何时何地可以捕获最低重量来源的
信息,那么您的博客中的任何内容都可能是:$) p>

是的,对于Apache Hama也可能是一个很酷的主题。大多数实现只考虑成本而不是真正的路径。在你的情况下(从你上面链接的博客),你将不得不提取一个实际上保存相邻顶点的顶点类为 LongWritable (也许是一个列表,而不是这个分割在文本对象上),并简单地添加一个父或源ID作为字段(当然还有 LongWritable )。
您将在映射器中传播时设置它,即循环遍历当前关键节点的相邻顶点的循环。



在reducer你会在遍历分组值的时候更新某个地方的最低点,那么你必须在顶点的顶点设置顶点的顶点,并且更新到最小值。



实际上,您可以从我的博客获得一些顶点类:
来源
或直接从存储库:
来源

也许它可以帮助你,这是相当无法维护,所以如果你有一个具体的问题请回到我身边。


以下是与Apache Hama一起使用的BSP算法: $ b

https://github.com/thomasjungblut/tjungblut-graph/blob/ master / src / de / jungblut / graph / bsp / SSSP.java

I have been looking for "MapReduce implementation of Shortest path search algorithms".

However, all the instances I could find "computed the shortest distance form node x to y", and none actually output the "actual shortest path like x-a-b-c-y".

As for what am I trying to achieve is that I have graphs with hundreds of 1000s of nodes and I need to perform frequent pattern analysis on shortest paths among the various nodes. This is for a research project I am working on.

It would be a great help if some one could point me to some implementation (if it exists) or give some pointers as to how to hack the existing SSSP implementations to generate the paths along with the distances.

解决方案

Basically these implementations work with some kind of messaging. So messages are send to HDFS between map and reduce stage.

In the reducer they are grouped and filtered by distance, the lowest distance wins. When the distance is updated in this case, you have to set the vertex (well, some ID probably) where the message came from.

So you have additional space requirement per vertex, but you can reconstruct every possible shortest path in the graph.

Based on your comment:

yes probably

I will need to write another class of the vertex object to hold this additional information. Thanks for the tip, though it would be very helpful if you could point out where and when I can capture this information of where the minimum weight came from, anything from your blog maybe :-)

Yea, could be a quite cool theme, also for Apache Hama. Most of the implementations are just considering the costs not the real path. In your case (from the blog you've linked above) you will have to extract a vertex class which actually holds the adjacent vertices as LongWritable (maybe a list instead of this split on the text object) and simply add a parent or source id as field (of course also LongWritable). You will set this when propagating in the mapper, that is the for loop that is looping over the adjacent vertices of the current key node.

In the reducer you will update the lowest somewhere while iterating over the grouped values, there you will have to set the source vertex in the key vertex by the vertex that updated to the minimum.

You can actually get some of the vertices classes here from my blog: Source or directly from the repository: Source

Maybe it helps you, it is quite unmaintained so please come back to me if you have a specific question.

Here is the same algorithm in BSP with Apache Hama:

https://github.com/thomasjungblut/tjungblut-graph/blob/master/src/de/jungblut/graph/bsp/SSSP.java

这篇关于Hadoop MapReduce实现图中最短的PATH,而不仅仅是距离的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆