How to convert an RDD of Maps to a DataFrame
Problem description
I have an RDD of Maps and I want to convert it to a DataFrame. Here is the input format of the RDD:
val mapRDD: RDD[Map[String, String]] = sc.parallelize(Seq(
  Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
  Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
  Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
  Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
  Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))
Is there any way to convert it into a DataFrame, something like this?
val df = mapRDD.toDF
df.show
empid, empName, depId
12 Rohan 201
13 Ross 201
14 Richard 401
15 Michale 501
16 John 701
Recommended answer
You can easily convert it into a Spark DataFrame. Here is code that does the trick:
// Needed for .toDF; assumes a SparkSession named `spark`, as in spark-shell
import spark.implicits._

val mapRDD = sc.parallelize(Seq(
  Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
  Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
  Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
  Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
  Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))

// Take the column names from the keys of the first record
val columns = mapRDD.first().keys.toList

// Look up each value by key so the result does not depend on Map iteration order
val resultantDF = mapRDD.map { value =>
  (value(columns(0)), value(columns(1)), value(columns(2)))
}.toDF(columns: _*)

resultantDF.show()
The output is:
+-----+-------+-----+
|empid|empName|depId|
+-----+-------+-----+
|   12|  Rohan|  201|
|   13|   Ross|  201|
|   14|Richard|  401|
|   15|Michale|  501|
|   16|   John|  701|
+-----+-------+-----+
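The tuple-based conversion above is tied to exactly three columns and expects every Map to contain every key. As a rough sketch of a more general variant (not part of the original answer), one could build `Row` objects against an explicit schema and call `createDataFrame`. The object name `MapRddToDataFrame`, the helper `toDataFrame`, the fixed column list, and the local master setting are all assumptions made for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object MapRddToDataFrame {

  // Hypothetical helper: converts an RDD of string Maps into a DataFrame
  // with one nullable string column per entry in `columns`, in that order.
  def toDataFrame(spark: SparkSession,
                  rdd: RDD[Map[String, String]],
                  columns: Seq[String]): DataFrame = {
    val schema = StructType(
      columns.map(name => StructField(name, StringType, nullable = true)))

    // Look up each column by key; a missing key becomes null
    val rowRDD = rdd.map(m => Row.fromSeq(columns.map(c => m.getOrElse(c, null))))

    spark.createDataFrame(rowRDD, schema)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for demonstration purposes only
    val spark = SparkSession.builder()
      .appName("MapRddToDataFrame")
      .master("local[*]")
      .getOrCreate()

    val mapRDD = spark.sparkContext.parallelize(Seq(
      Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
      Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
      Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
      Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
      Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))

    // Stating the column order explicitly avoids relying on Map key order
    val df = toDataFrame(spark, mapRDD, Seq("empid", "empName", "depId"))
    df.show()

    spark.stop()
  }
}
```

Because the column list is passed in rather than read from the first record, this version works for any number of columns and tolerates records with missing keys. All columns remain strings; casting to numeric types would be a separate step.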