How to flatten a collection with Spark/Scala?


Question

In Scala I can flatten a collection using:

val array = Array(List("1,2,3").iterator, List("1,4,5").iterator)
//> array  : Array[Iterator[String]] = Array(non-empty iterator, non-empty iterator)

array.toList.flatten
//> res0: List[String] = List(1,2,3, 1,4,5)

But how can I do something similar in Spark?

Reading the API docs (http://spark.apache.org/docs/0.7.3/api/core/index.html#spark.RDD), there does not seem to be a method that provides this functionality.

Recommended answer

Try flatMap with an identity map function (y => y):

scala> val x = sc.parallelize(List(List("a"), List("b"), List("c", "d")))
x: org.apache.spark.rdd.RDD[List[String]] = ParallelCollectionRDD[1] at parallelize at <console>:12

scala> x.collect()
res0: Array[List[String]] = Array(List(a), List(b), List(c, d))

scala> x.flatMap(y => y)
res3: org.apache.spark.rdd.RDD[String] = FlatMappedRDD[3] at flatMap at <console>:15

scala> x.flatMap(y => y).collect()
res4: Array[String] = Array(a, b, c, d)
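
To see why this works, it may help to compare flatten and flatMap on an ordinary Scala collection, where both are available. The following is a minimal sketch that does not need Spark; the object name FlattenSketch and the sample data are made up for illustration. It suggests that flatMap with an identity function produces the same result as flatten, which is the equivalence the answer relies on for the RDD.

```scala
object FlattenSketch {
  // Hypothetical sample data, mirroring the nested lists used in the answer.
  val nested: List[List[String]] = List(List("a"), List("b"), List("c", "d"))

  // flatten concatenates the inner lists into one list.
  val viaFlatten: List[String] = nested.flatten

  // flatMap maps each element to a collection and concatenates the results;
  // with an identity function the mapping step is a no-op, so only the
  // concatenation remains.
  val viaFlatMap: List[String] = nested.flatMap(y => y)

  // The standard library's identity function can stand in for y => y.
  val viaIdentity: List[String] = nested.flatMap(identity)

  def main(args: Array[String]): Unit = {
    println(viaFlatten)  // prints List(a, b, c, d)
    println(viaFlatMap)  // prints List(a, b, c, d)
    println(viaIdentity) // prints List(a, b, c, d)
  }
}
```

All three values hold the same four strings in the same order.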
