How to parse a JSON string into different columns in Spark Scala?

Views: 56 | Published: 2021/6/25 18:36:41 | Tags: json scala apache-spark apache-spark-sql

Question

When I read the Parquet file, the data looks like this:

|id |name |activegroup|
|1  |abc  |[{"groupID":"5d","role":"admin","status":"A"},{"groupID":"58","role":"admin","status":"A"}]|

Data types of the individual fields:

root

|--id : int
|--name : String
|--activegroup : String

The activegroup column is a string, so the explode function does not work on it. The required output is:

|id |name |groupID|role |status|
|1  |abc  |5d     |admin|A     |
|1  |abc  |58     |admin|A     |

Please help me parse the above in the latest version of Spark Scala.

Recommended answer

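First, extract the JSON schema (the functions used here come from org.apache.spark.sql.functions._, and the $ syntax and as[String] require spark.implicits._):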

  // Infer the schema from the first value of the column (assumes it is non-null and representative)
  val schema = schema_of_json(lit(df.select($"activegroup").as[String].first))
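Inferring the schema from the first row only works if that row is non-null and contains every field. As an alternative, the schema can be written by hand. The following is a minimal sketch, assuming all three fields are strings as in the sample row; the name explicitSchema is hypothetical and not part of the original answer.

```scala
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Hand-written equivalent of the inferred schema: an array of structs.
// Assumption: every field is a string, as in the sample row from the question.
val explicitSchema = ArrayType(
  StructType(Seq(
    StructField("groupID", StringType),
    StructField("role", StringType),
    StructField("status", StringType)
  ))
)
```

Because from_json also accepts a DataType as its second argument, explicitSchema can be passed to it in place of the inferred schema.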

Once you have the schema, you can parse the activegroup column, which is a String, with from_json and then explode the resulting array.

Once the column holds parsed JSON, you can extract its values with $"columnName.field":

  // Parse the string into an array of structs, then emit one row per array element
  val dfresult = df.withColumn("jsonColumn", explode(
                                      from_json($"activegroup", schema)))
                   .select($"id", $"name",
                           $"jsonColumn.groupID" as "groupId",
                           $"jsonColumn.role" as "role",
                           $"jsonColumn.status" as "status")

If you want to extract the whole JSON and the element names are fine as they are, you can use * instead:

  // jsonColumn.* expands every struct field into its own column, named after the JSON keys
  val dfresult = df.withColumn("jsonColumn", explode(
                                 from_json($"activegroup", schema)))
                   .select($"id", $"name", $"jsonColumn.*")

Result

+---+----+-------+-----+------+
| id|name|groupId| role|status|
+---+----+-------+-----+------+
|  1| abc|     5d|admin|     A|
|  1| abc|     58|admin|     A|
+---+----+-------+-----+------+
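To try the approach without the original Parquet file, the steps can be combined into one script. This is a sketch under stated assumptions: the local SparkSession settings and the application name are placeholders, the single row is recreated from the sample in the question, and Spark 2.4 or later is expected on the classpath (schema_of_json is available from that version).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{explode, from_json, lit, schema_of_json}

// Local session for experimentation only; master and app name are placeholders.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("json-explode-sketch")
  .getOrCreate()
import spark.implicits._

// Recreate the single sample row from the question instead of reading Parquet.
val df = Seq(
  (1, "abc",
    """[{"groupID":"5d","role":"admin","status":"A"},{"groupID":"58","role":"admin","status":"A"}]""")
).toDF("id", "name", "activegroup")

// Infer the schema from the first value of the string column.
val schema = schema_of_json(lit(df.select($"activegroup").as[String].first))

// Parse the string, emit one row per array element, then flatten the struct.
val result = df
  .withColumn("jsonColumn", explode(from_json($"activegroup", schema)))
  .select($"id", $"name", $"jsonColumn.*")

result.show()
```

With the * form, the flattened columns take their names from the JSON keys, so the first one is called groupID rather than groupId.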
