How to transform JSON strings in columns of dataframe in PySpark?

Views: 36 Published: 2021/11/14 23:08:42 Tags: apache-spark pyspark apache-spark-sql pyspark-sql


Problem description

I have a PySpark dataframe as shown below:

+---------------+---+
|            _c0|_c1|
+---------------+---+
|{"object":"F...|  0|
|{"object":"F...|  1|
|{"object":"F...|  2|
|{"object":"E...|  3|
|{"object":"F...|  4|
|{"object":"F...|  5|
|{"object":"F...|  6|
|{"object":"S...|  7|
|{"object":"F...|  8|
+---------------+---+

The column _c0 contains a JSON string, i.e. a string in dictionary form:

'{"object":"F","time":"2019-07-18T15:08:16.143Z","values":[0.22124142944812775,0.2147877812385559,0.16713131964206696,0.3102800250053406,0.31872493028640747,0.3366488814353943,0.25324496626853943,0.14537988603115082,0.12684473395347595,0.13864757120609283,0.15222792327404022,0.238663449883461,0.22896413505077362,0.237777978181839]}'

How can I convert the above string to dictionary form, fetch each key-value pair, and store it in variables? I don't want to convert the dataframe to pandas, as that is expensive.
