How to extract values from a JSON string?


Problem description

I have a file with a bunch of columns. One column, called jsonstring, is of string type and holds JSON strings. Let's say the format is the following:

{
    "key1": "value1",
    "key2": {
        "level2key1": "level2value1",
        "level2key2": "level2value2"
    }
}

I want to parse this column with something like jsonstring.key1, jsonstring.key2.level2key1 so that it returns value1, level2value1.

How can I do that in Scala or Spark SQL?

Recommended answer

You can use withColumn + udf + json4s:

import org.json4s.{DefaultFormats, MappingException}
import org.json4s.jackson.JsonMethods._
import org.apache.spark.sql.functions._

// Parses one JSON string and returns (key1, key2.level2key1).
// extract[String] throws (typically a MappingException) if a key is missing.
def getJsonContent(jsonstring: String): (String, String) = {
    implicit val formats = DefaultFormats
    val parsedJson = parse(jsonstring)
    val value1 = (parsedJson \ "key1").extract[String]
    val level2value1 = (parsedJson \ "key2" \ "level2key1").extract[String]
    (value1, level2value1)
}
val getJsonContentUDF = udf((jsonstring: String) => getJsonContent(jsonstring))

// The returned tuple becomes a struct column with fields _1 and _2.
val parsed = df.withColumn("parsedJson", getJsonContentUDF(df("jsonstring")))
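
Since the question also asks about Spark SQL, here is a rough sketch of a different approach that avoids the UDF: Spark's built-in get_json_object function, which takes a JSONPath-style expression. This is not the answer's method. The object name JsonColumnSketch, the helper extractValues, the view name records, and the one-row sample DataFrame are all made up for illustration, and the local SparkSession is only there so the snippet can run on its own.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, get_json_object}

object JsonColumnSketch {
  // Hypothetical helper: pulls the two values from the question out of the
  // jsonstring column. get_json_object returns null when a path is absent.
  def extractValues(df: DataFrame): DataFrame =
    df.select(
      get_json_object(col("jsonstring"), "$.key1").as("key1"),
      get_json_object(col("jsonstring"), "$.key2.level2key1").as("level2key1")
    )

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would reuse its own session.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("json-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample data standing in for the file described in the question.
    val df = Seq(
      """{"key1": "value1", "key2": {"level2key1": "level2value1", "level2key2": "level2value2"}}"""
    ).toDF("jsonstring")

    // DataFrame API version.
    extractValues(df).show(false)

    // Spark SQL version: the same function is callable from SQL.
    df.createOrReplaceTempView("records")
    spark.sql(
      """SELECT get_json_object(jsonstring, '$.key1') AS key1,
        |       get_json_object(jsonstring, '$.key2.level2key1') AS level2key1
        |FROM records""".stripMargin
    ).show(false)

    spark.stop()
  }
}
```

Both queries return the extracted values as plain string columns rather than a struct, which may be more convenient than the _1/_2 fields produced by the UDF. If many fields are needed, parsing once with from_json and an explicit schema is probably a better fit than repeated get_json_object calls.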
