Spark RDD mapping one row of data into multiple rows
Question
I have a text file with data that looks like this:
Type1 1 3 5 9
Type2 4 6 7 8
Type3 3 6 9 10 11 25
I'd like to transform it into an RDD with rows like this:
1 Type1
3 Type1
3 Type3
......
I started with a case class:
case class MyData(uid: Int, gid: String)
I'm new to Spark and Scala, and I can't seem to find an example that does this.
Recommended answer
It seems you want something like this?
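rdd.flatMap { line =>
  // The first token is the group id (gid); every remaining token is a uid.
  line.split(' ').toList match {
    // Emit one MyData per uid, so a single input line yields several rows.
    case gid :: rest => rest.map(x => MyData(x.toInt, gid))
    // Keeps the match exhaustive; split always returns at least one token.
    case Nil => Nil
  }
}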
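The one-to-many mapping can be tried without a cluster, because flatMap on a plain Scala collection behaves the same way for this purpose. The following is only a sketch meant for the Scala REPL or a script: parseLine is a hypothetical helper name, the sample lines are copied from the question, and splitting on runs of whitespace (rather than a single space) is an assumption about the input format.

```scala
// Case class from the question, written as valid Scala.
case class MyData(uid: Int, gid: String)

// Hypothetical helper (not from the original answer): turns one input line
// into zero or more MyData rows.
def parseLine(line: String): List[MyData] =
  line.trim.split("\\s+").toList match {
    // First token is the gid; each remaining token becomes its own row.
    case gid :: rest if gid.nonEmpty => rest.map(x => MyData(x.toInt, gid))
    // Blank line: emit nothing.
    case _ => Nil
  }

// Sample data copied from the question.
val lines = Seq("Type1 1 3 5 9", "Type2 4 6 7 8", "Type3 3 6 9 10 11 25")

// flatMap concatenates the per-line lists into one flat sequence of rows.
val rows = lines.flatMap(parseLine)
```

With an RDD[String] the same function should plug in as rdd.flatMap(parseLine). The desired output in the question looks ordered by uid; if that matters, sortBy(_.uid) could be applied afterwards. Note that toInt throws on a non-numeric token, so real input may need validation first.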