How to parse a JSON string into separate columns in Spark Scala?


Problem description

When I read a Parquet file, it contains the following data:

|id |name |activegroup|

|1  |abc  |[{"groupID":"5d","role":"admin","status":"A"},{"groupID":"58","role":"admin","status":"A"}]|

Data types of the fields:

|--id : int
|--name : String
|--activegroup : String

The activegroup column is a String, so the explode function does not work on it directly. The required output is:

|id |name |groupID|role|status|
|1  |abc  |5d     |admin|A    |
|1  |abc  |58     |admin|A    |

Please help me parse the above with the latest version of Spark (Scala).

Recommended answer

First, extract the JSON schema from a sample value of the column:

  // Requires: import org.apache.spark.sql.functions._ and import spark.implicits._
  val schema = schema_of_json(lit(df.select($"activegroup").as[String].first))
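Inferring the schema from the first row assumes that row is representative of the whole column. As an alternative, the schema can be written out by hand. The sketch below assumes the three fields seen in the sample value are all strings; the name `explicitSchema` is hypothetical and not part of the original answer.

```scala
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Assumed shape: a JSON array of objects with three string fields,
// matching the sample activegroup value shown in the question.
val explicitSchema: ArrayType = ArrayType(
  StructType(Seq(
    StructField("groupID", StringType, nullable = true),
    StructField("role", StringType, nullable = true),
    StructField("status", StringType, nullable = true)
  ))
)
```

Because from_json also accepts a DataType as its second argument, `explicitSchema` can be passed in place of the inferred `schema`.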

Once you have the schema, you can convert the activegroup column from a String into an array of structs with from_json, and then explode it so that each array element becomes its own row.

Once the column is a struct, you can extract its fields with $"columnName.field":

  // Parse the JSON string into an array of structs, then emit one row per element
  val dfresult = df.withColumn("jsonColumn", explode(
                                      from_json($"activegroup", schema)))
                   .select($"id", $"name",
                           $"jsonColumn.groupID" as "groupId",
                           $"jsonColumn.role" as "role",
                           $"jsonColumn.status" as "status")

If you want to extract every field of the JSON and the original field names are fine for you, you can use * instead:

  val dfresult = df.withColumn("jsonColumn", explode(
                                      from_json($"activegroup", schema)))
                   .select($"id", $"name", $"jsonColumn.*")

Result

+---+----+-------+-----+------+
| id|name|groupId| role|status|
+---+----+-------+-----+------+
|  1| abc|     5d|admin|     A|
|  1| abc|     58|admin|     A|
+---+----+-------+-----+------+
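For trying the approach without the Parquet file, the steps above can be put together into a minimal end-to-end sketch. It assumes a local Spark session and an in-memory DataFrame standing in for the file; `sample` and `parseActiveGroup` are hypothetical names introduced only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, from_json, lit, schema_of_json}

// Local session for the sketch; in spark-shell the existing session is reused.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("parse-activegroup")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the Parquet data shown in the question.
val sample = Seq(
  (1, "abc",
    """[{"groupID":"5d","role":"admin","status":"A"},{"groupID":"58","role":"admin","status":"A"}]""")
).toDF("id", "name", "activegroup")

// Hypothetical helper: infers the schema from the first row (assumed to be
// representative), parses the string, explodes the array, and flattens the struct.
def parseActiveGroup(df: DataFrame): DataFrame = {
  val schema = schema_of_json(lit(df.select(col("activegroup")).as[String].first()))
  df.withColumn("jsonColumn", explode(from_json(col("activegroup"), schema)))
    .select(col("id"), col("name"), col("jsonColumn.*"))
}

val result = parseActiveGroup(sample)
result.show(false)
```

With the `.*` selection the columns keep the names of the JSON keys, so the third column is called groupID rather than groupId.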
