Difference and use-cases of RDD and Pair RDD
Question
I am new to Spark and trying to understand the difference between a normal RDD and a pair RDD. What are the use-cases where a pair RDD is used as opposed to a normal RDD? If possible, I want to understand the internals of a pair RDD with an example. Thanks
Recommended answer
The main differences are:
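A pair RDD is an RDD whose elements are (key, value) tuples. Key-based operations such as reduceByKey, groupByKey, or join are only available on pair RDDs, and they produce key-value pairs. Operations available on any RDD (such as flatMap or reduce) give you a collection of values or a single value. Note that map is a general RDD operation; it is commonly used to turn a normal RDD into a pair RDD by emitting (key, value) tuples.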
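Pair RDD operations are applied per key, and different keys are processed in parallel. Operations on a normal RDD (like flatMap) are applied to every element of the collection, with no notion of a key. This makes a pair RDD the natural choice whenever you need to aggregate, group, or join by key, for example in a word count.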
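Since the question asks for an example of the internals, here is a minimal sketch in plain Python (not Spark itself) that models the difference. It assumes an RDD can be pictured as a list of partitions; the helper names rdd_reduce and reduce_by_key and the sample data are hypothetical and only mimic the behaviour of Spark's reduce and reduceByKey. It is a simplified model of the idea (combine per key within each partition, shuffle by key hash, merge), not Spark's actual implementation.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical stand-in for an RDD: a list of partitions, each a list of elements.
words = [["spark", "rdd", "spark"], ["pair", "rdd", "spark"]]


def rdd_reduce(partitions, func):
    """Model of reduce on a normal RDD: fold all elements into one value, no keys."""
    partials = [reduce(func, part) for part in partitions if part]
    return reduce(func, partials)


def reduce_by_key(partitions, func, num_partitions=2):
    """Model of reduceByKey on a pair RDD of (key, value) tuples."""
    # Step 1: combine values per key inside each partition (local combine).
    local = []
    for part in partitions:
        acc = {}
        for key, value in part:
            acc[key] = func(acc[key], value) if key in acc else value
        local.append(acc)
    # Step 2: "shuffle" so all partial results for a key land in the same bucket.
    shuffled = [defaultdict(list) for _ in range(num_partitions)]
    for acc in local:
        for key, value in acc.items():
            shuffled[hash(key) % num_partitions][key].append(value)
    # Step 3: merge the partial results for each key.
    return [{k: reduce(func, vs) for k, vs in bucket.items()} for bucket in shuffled]


# Normal RDD: elements are plain values, the result is a single value.
lengths = [[len(w) for w in part] for part in words]
total_chars = rdd_reduce(lengths, lambda a, b: a + b)

# Pair RDD: elements are (key, value) tuples, the result is one value per key.
pairs = [[(w, 1) for w in part] for part in words]
counts = {}
for bucket in reduce_by_key(pairs, lambda a, b: a + b):
    counts.update(bucket)

print(total_chars)  # 25
print(counts["spark"], counts["rdd"], counts["pair"])  # 3 2 1
```

In PySpark the pair RDD half of this would typically be written as rdd.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b), where rdd is assumed to be an RDD of words.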