Apache GraphX: merge/combine multiple graphs
Question
I'm new to Apache GraphX and want to know whether graphs can be merged/combined in GraphX. Say I have two graphs like these:
graph1: A —1—> B —1—> C —1—> D
        |
        —1—> E —1—> F
graph2: A —1—> B —1—> C
        |
        —1—> G
and I want to get a merged result like this:
merge result: A —2—> B —2—> C —1—> D
              |
              —1—> E —1—> F
              |
              —1—> G
I can do this in an embedded Neo4j graph database by using Path objects to compare paths, accumulate edge counts, and add the missing paths.
Is there a way, or an example, that can help me do the same thing in GraphX?
Thanks.
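Recommended answer
You need to create a new graph from the union of the vertices and edges of both graphs, and then call groupEdges() to merge the duplicate edges. groupEdges() only merges edges that sit in the same partition, so partitionBy() has to be called first: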
import org.apache.spark.graphx._
import org.apache.spark.graphx.PartitionStrategy.RandomVertexCut

// Run in spark-shell, where sc is the predefined SparkContext.
// The same vertex must have the same ID in both graphs (A=1, B=2, C=3).
val verts1 = sc.parallelize(Seq(
  (1L, "A"),
  (2L, "B"),
  (3L, "C"),
  (4L, "D"),
  (5L, "E"),
  (6L, "F")))
val edges1 = sc.parallelize(Seq(
  Edge(1L, 2L, 1),
  Edge(2L, 3L, 1),
  Edge(3L, 4L, 1),
  Edge(1L, 5L, 1),
  Edge(5L, 6L, 1)))
val graph1 = Graph(verts1, edges1)

val verts2 = sc.parallelize(Seq(
  (1L, "A"),
  (2L, "B"),
  (3L, "C"),
  (7L, "G")))
val edges2 = sc.parallelize(Seq(
  Edge(1L, 2L, 1),
  Edge(2L, 3L, 1),
  Edge(1L, 7L, 1)))
val graph2 = Graph(verts2, edges2)

// Union both graphs, co-locate identical edges, then sum their weights.
val graph: Graph[String,Int] = Graph(
  graph1.vertices.union(graph2.vertices),
  graph1.edges.union(graph2.edges)
).partitionBy(RandomVertexCut).
  groupEdges( (attr1, attr2) => attr1 + attr2 )
If you now look at the edges of this new graph, you can see the merged result:
scala> graph.edges.collect
res0: Array[org.apache.spark.graphx.Edge[Int]] =
Array(Edge(1,2,2), Edge(2,3,2), Edge(1,5,1),
Edge(5,6,1), Edge(1,7,1), Edge(3,4,1))
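Since the question asks about multiple graphs, the same union-then-groupEdges() approach can be wrapped in a small helper that accepts any number of graphs. This is only a sketch: mergeGraphs is a hypothetical name, not part of GraphX, and it assumes that every graph uses the same ID and the same label for the same vertex, as graph1 and graph2 do above.

```scala
import org.apache.spark.graphx._
import org.apache.spark.graphx.PartitionStrategy.RandomVertexCut
import org.apache.spark.rdd.RDD

// Hypothetical helper (not a GraphX API): merges graphs whose vertex IDs
// already agree, summing the weights of edges that share source and target.
def mergeGraphs(graphs: Seq[Graph[String, Int]]): Graph[String, Int] = {
  require(graphs.nonEmpty, "need at least one graph")

  // Widen VertexRDD / EdgeRDD to plain RDDs so that union type-checks.
  val vertexSets: Seq[RDD[(VertexId, String)]] = graphs.map(_.vertices)
  val edgeSets: Seq[RDD[Edge[Int]]] = graphs.map(_.edges)

  // Keep one label per vertex ID (assumes the labels agree across graphs).
  val vertices = vertexSets.reduce(_ union _).reduceByKey((label, _) => label)
  val edges = edgeSets.reduce(_ union _)

  // groupEdges only merges edges within a partition, so repartition first.
  Graph(vertices, edges)
    .partitionBy(RandomVertexCut)
    .groupEdges(_ + _)
}
```

With the graphs defined above, mergeGraphs(Seq(graph1, graph2)) should produce the same edges as the result shown. To read the result by vertex name rather than by ID, something like merged.triplets.map(t => s"${t.srcAttr} -${t.attr}-> ${t.dstAttr}").collect().foreach(println) can be used, where merged is the returned graph; the order of the printed lines is not guaranteed.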