Spark Scala - Joining two arrays by VertexID

Question

I have two arrays in the following format:

scala> cPV.take(5)
res18: Array[(org.apache.spark.graphx.VertexId, String)] = Array((-496366541,7804412), (183389035,11517829), (1300761459,36164965), (978932066,32135154), (370291237,40355685))

scala> fC.take(5)
res19: Array[(org.apache.spark.graphx.VertexId, Int)] = Array((386253628,1), (-1141923433,1), (1871855296,7), (1938255756,1), (-749015657,5))

I need to join them into the format Array[(org.apache.spark.graphx.VertexId, Int, String)].

I tried .join(), but it throws the following error:

val mVP = fC.join(cPV)
<console>:64: error: value join is not a member of Array[(org.apache.spark.graphx.VertexId, Int)]
       val mVP = fC.join(cPV)

I also tried this, and it didn't work.

Recommended answer

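I tried converting the array to an RDD with sc.parallelize first, which makes join available (cPV appears to be an RDD already, since the join accepts it):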

val fCRDD = sc.parallelize(fC)
scala> val mVP = fCRDD.join(cPV)
mVP: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, (Int, String))] = MapPartitionsRDD[106] at join at <console>:67

scala> mVP.take(5)
res21: Array[(org.apache.spark.graphx.VertexId, (Int, String))] = Array((-891966589,(4,D)), (166544732,(74,V)), (1871855296,(7,LG)), (1416009424,(6,Dck)), (-241988197,(4,L)))
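Note that the join returns nested pairs of type (VertexId, (Int, String)), not the flat triple asked for in the question. A minimal sketch of the extra flattening step follows, assuming cPV is an RDD[(VertexId, String)] and fC is a local array. The object name JoinByVertexId, the helper joinFlat, the local[*] master, and the sample data are illustrative assumptions, not part of the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.VertexId
import org.apache.spark.rdd.RDD

object JoinByVertexId {
  // Hypothetical helper: joins a local array with an RDD on VertexId
  // and flattens (id, (count, label)) into (id, count, label).
  def joinFlat(sc: SparkContext,
               fC: Array[(VertexId, Int)],
               cPV: RDD[(VertexId, String)]): RDD[(VertexId, Int, String)] = {
    // join is only defined on pair RDDs, so the array is parallelized first
    val fCRDD = sc.parallelize(fC)
    fCRDD.join(cPV).map { case (id, (count, label)) => (id, count, label) }
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs without a cluster
    val conf = new SparkConf().setAppName("JoinByVertexId").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Illustrative sample data, not the poster's full data set
      val fC: Array[(VertexId, Int)] = Array((1871855296L, 7), (386253628L, 1))
      val cPV = sc.parallelize(Seq[(VertexId, String)]((1871855296L, "LG"), (183389035L, "V")))

      // collect() brings the result back as Array[(VertexId, Int, String)]
      val mVP: Array[(VertexId, Int, String)] = joinFlat(sc, fC, cPV).collect()
      mVP.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

join is an inner join, so only ids present on both sides survive. Calling collect() is reasonable only when the joined result fits in driver memory.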

Sorry, noob here. I should have tried this before posting the question.
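For completeness: the original error occurs because join is defined for pair RDDs, not for a plain Scala Array. If both collections are small local arrays, the same inner join can be sketched without Spark by building a lookup Map. The names LocalJoin and joinById are hypothetical, the VertexId alias is a stand-in for the GraphX type (which is an alias for Long), and the sample data is illustrative.

```scala
object LocalJoin {
  // Stand-in for org.apache.spark.graphx.VertexId, which is an alias for Long
  type VertexId = Long

  // Inner join of two local arrays on their first element.
  // Assumes ids in cPV are unique; with duplicates, toMap keeps the last one.
  def joinById(fC: Array[(VertexId, Int)],
               cPV: Array[(VertexId, String)]): Array[(VertexId, Int, String)] = {
    val labels: Map[VertexId, String] = cPV.toMap
    // Keep only ids that appear on both sides
    fC.collect { case (id, count) if labels.contains(id) => (id, count, labels(id)) }
  }

  def main(args: Array[String]): Unit = {
    // Illustrative sample data
    val fC = Array[(VertexId, Int)]((1871855296L, 7), (386253628L, 1))
    val cPV = Array[(VertexId, String)]((1871855296L, "LG"), (183389035L, "V"))
    joinById(fC, cPV).foreach(println)
  }
}
```

This runs entirely on the driver, so it is only a reasonable choice when both arrays fit comfortably in memory. For larger data, the RDD join is the better fit.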
