如何将字典列表转换为Pyspark DataFrame [英] How to convert list of dictionaries into Pyspark DataFrame

查看:656
本文介绍了如何将字典列表转换为Pyspark DataFrame的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想将我的词典列表转换为DataFrame.这是列表:

I want to convert my list of dictionaries into DataFrame. This is the list:

mylist = 
[
  {"type_activity_id":1,"type_activity_name":"xxx"},
  {"type_activity_id":2,"type_activity_name":"yyy"},
  {"type_activity_id":3,"type_activity_name":"zzz"}
]

这是我的代码:

from pyspark.sql.types import StringType

df = spark.createDataFrame(mylist, StringType())

df.show(2,False)

+-----------------------------------------+
|                                    value|
+-----------------------------------------+
|{type_activity_id=1,type_activity_id=xxx}|
|{type_activity_id=2,type_activity_id=yyy}|
|{type_activity_id=3,type_activity_id=zzz}|
+-----------------------------------------+

我假设我应该为每列提供一些映射和类型,但是我不知道该怎么做.

I assume that I should provide some mapping and types for each column, but I don't know how to do it.

更新:

我也尝试过:

schema = ArrayType(
    StructType([StructField("type_activity_id", IntegerType()),
                StructField("type_activity_name", StringType())
                ]))
df = spark.createDataFrame(mylist, StringType())
df = df.withColumn("value", from_json(df.value, schema))

但是随后我得到了null值:

+-----+
|value|
+-----+
| null|
| null|
+-----+

推荐答案

您可以这样做.您将获得一个包含2列的数据框.

You can do it like this. You will get a dataframe with 2 columns.

mylist = [
  {"type_activity_id":1,"type_activity_name":"xxx"},
  {"type_activity_id":2,"type_activity_name":"yyy"},
  {"type_activity_id":3,"type_activity_name":"zzz"}
]

myJson = sc.parallelize(mylist)
myDf = sqlContext.read.json(myJson)

输出:

+----------------+------------------+
|type_activity_id|type_activity_name|
+----------------+------------------+
|               1|               xxx|
|               2|               yyy|
|               3|               zzz|
+----------------+------------------+

这篇关于如何将字典列表转换为Pyspark DataFrame的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆