Convert List into DataFrame in Spark Scala

Problem description

I have a list with more than 30 strings. How do I convert the list into a DataFrame? What I tried:

For example:

val list = List("a","b","v","b").toDS().toDF()

Output:


+-------+
|  value|
+-------+
|a      |
|b      |
|v      |
|b      |
+-------+


Expected output:


  +---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
|  a|  b|  v|  b|
+---+---+---+---+

Any help on this would be appreciated.

Recommended answer

List("a","b","c","d")表示具有一个字段的记录,因此结果集在每一行中显示一个元素.

List("a","b","c","d") represents a record with one field and so the resultset displays one element in each row.

To get the expected output, the row should have four fields/elements in it. So we wrap the list as List(("a","b","c","d")), which represents one row with four fields. In a similar fashion, a list with two rows goes as List(("a1","b1","c1","d1"),("a2","b2","c2","d2")).

scala> val list = sc.parallelize(List(("a", "b", "c", "d"))).toDF()
list: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: string, _4: string]

scala> list.show
+---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
|  a|  b|  c|  d|
+---+---+---+---+
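
As a side note (not part of the original answer), toDF also accepts explicit column names, so the default _1 .. _4 labels can be replaced; the names below are only examples:

scala> val named = sc.parallelize(List(("a", "b", "c", "d"))).toDF("c1", "c2", "c3", "c4")
named: org.apache.spark.sql.DataFrame = [c1: string, c2: string, c3: string, c4: string]

scala> named.show
+---+---+---+---+
| c1| c2| c3| c4|
+---+---+---+---+
|  a|  b|  c|  d|
+---+---+---+---+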


scala> val list = sc.parallelize(List(("a1","b1","c1","d1"),("a2","b2","c2","d2"))).toDF
list: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: string, _4: string]

scala> list.show
+---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
| a1| b1| c1| d1|
| a2| b2| c2| d2|
+---+---+---+---+
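
One caveat relative to the original question: Scala tuples stop at 22 elements, so the tuple approach above cannot directly hold a list of more than 30 strings. Below is a minimal sketch of one workaround, using Row together with a programmatically built schema; the placeholder list, the column names, and the spark/sc variables provided by spark-shell are assumptions, not part of the original answer.

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Placeholder list standing in for the 30+ strings from the question
val strings = (1 to 30).map(i => s"s$i").toList

// One Row whose fields are the list elements
val row = Row(strings: _*)

// One StringType column per element, named _1, _2, ... to mirror the tuple output
val schema = StructType(strings.indices.map(i => StructField(s"_${i + 1}", StringType, nullable = true)))

val wide = spark.createDataFrame(sc.parallelize(Seq(row)), schema)
wide.show()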
