Cartesian product of two RDDs in Spark
Problem description
I am completely new to Apache Spark and I am trying to compute the Cartesian product of two RDDs. As an example, I have A and B like:
A = {(a1,v1),(a2,v2),...}
B = {(b1,s1),(b2,s2),...}
I need a new RDD like:
C = {((a1,v1),(b1,s1)), ((a1,v1),(b2,s2)), ...}
Any idea how I can do this? As simple as possible :)
Thanks in advance
PS: I finally did it like this as suggested by @Amit Kumar:
cartesianProduct = A.cartesian(B)
Solution
That's not the dot product, that's the Cartesian product. Use the cartesian method:
def cartesian[U](other: spark.api.java.JavaRDDLike[U, _]): JavaPairRDD[T, U]
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
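To make the pairing concrete, here is a minimal sketch in plain Python that models what cartesian produces, using itertools.product on local lists instead of real RDDs. The sample values are hypothetical stand-ins for the A and B from the question, and the PySpark call in the comment assumes an existing SparkContext named sc.

```python
from itertools import product

# Hypothetical sample data standing in for the RDDs A and B from the question.
A = [("a1", "v1"), ("a2", "v2")]
B = [("b1", "s1"), ("b2", "s2")]

# Local model of A.cartesian(B): every element of A paired with every element of B.
C = list(product(A, B))

# With PySpark (assuming an existing SparkContext named sc), the equivalent would be:
#   C = sc.parallelize(A).cartesian(sc.parallelize(B)).collect()
# The element order returned by Spark may differ from this local model.

for pair in C:
    print(pair)
```

Note that the result has len(A) * len(B) elements, so the Cartesian product of two large RDDs can become very large very quickly.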