Cartesian product of two RDDs in Spark

Question

I am completely new to Apache Spark and I am trying to compute the Cartesian product of two RDDs. As an example, I have A and B like:

A = {(a1,v1),(a2,v2),...}
B = {(b1,s1),(b2,s2),...}

I need a new RDD like:

C = {((a1,v1),(b1,s1)), ((a1,v1),(b2,s2)), ...}

Any idea how I can do this? As simple as possible :)

Thanks in advance

PS: I finally did it like this as suggested by @Amit Kumar:

cartesianProduct = A.cartesian(B)

Answer

That's not the dot product; that's the Cartesian product. Use the cartesian method:

def cartesian[U](other: spark.api.java.JavaRDDLike[U, _]): JavaPairRDD[T, U]

Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.

Source
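
To make the definition concrete, here is a minimal plain-Python sketch of which pairs the result contains. It is not Spark code: the lists A and B are hypothetical sample data standing in for the contents of the two RDDs, and itertools.product plays the role of the cartesian method on local collections.

```python
from itertools import product

# Hypothetical sample data standing in for the contents of RDDs A and B.
A = [("a1", "v1"), ("a2", "v2")]
B = [("b1", "s1"), ("b2", "s2")]

# Pair every element of A with every element of B. These are the same pairs
# that A.cartesian(B) would contain if A and B were RDDs.
C = list(product(A, B))

for pair in C:
    print(pair)

# The result always has len(A) * len(B) elements.
print(len(C) == len(A) * len(B))
```

Because the result size is the product of the two input sizes, the operation can get expensive quickly on large RDDs. The element order shown here comes from itertools.product; the order of elements in a Spark result should not be assumed to match it.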
