Convert RDD to JSON Object


Question

I have an RDD of type RDD[(String, List[String])].

Example:

(FRUIT, List(Apple,Banana,Mango))
(VEGETABLE, List(Potato,Tomato))

I want to convert the above into a JSON object like the one below.

{
  "categories": [
    {
      "name": "FRUIT",
      "nodes": [
        {
          "name": "Apple",
          "isInTopList": false
        },
        {
          "name": "Banana",
          "isInTopList": false
        },
        {
          "name": "Mango",
          "isInTopList": false
        }
      ]
    },
    {
      "name": "VEGETABLE",
      "nodes": [
        {
          "name": "Potato",
          "isInTopList": false
        },
        {
          "name": "Tomato",
          "isInTopList": false
        }
      ]
    }
  ]
}

Please suggest the best possible way to do it.

NOTE: "isInTopList": false is constant and must be present on every item in the JSON object.

Solution

First I used the following code to reproduce the scenario that you mentioned:

val sampleArray = Array(
("FRUIT", List("Apple", "Banana", "Mango")),
("VEGETABLE", List("Potato", "Tomato")))

val sampleRdd = sc.parallelize(sampleArray)
sampleRdd.foreach(println) // Printing the result

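Now I use the json4s Scala library to convert this RDD into the JSON structure that you requested. Note that collect() brings the whole RDD to the driver, so this assumes the data fits in driver memory: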

import org.json4s.native.JsonMethods._
import org.json4s.JsonDSL.WithDouble._

val json = "categories" -> sampleRdd.collect().toList.map {
  case (name, nodes) =>
    ("name", name) ~
    ("nodes", nodes.map { node => ("name", node) })
}

println(compact(render(json))) // Printing the rendered JSON

The result is:

{"categories":[{"name":"FRUIT","nodes":[{"name":"Apple"},{"name":"Banana"},{"name":"Mango"}]},{"name":"VEGETABLE","nodes":[{"name":"Potato"},{"name":"Tomato"}]}]}
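
The output above does not yet carry the constant "isInTopList": false field that the question asks for on every node. A minimal sketch of one way to add it with the same json4s DSL follows. The helper name toCategoriesJson and the plain List input are my own assumptions for illustration (they are not part of the original answer), and the sketch assumes json4s-native is on the classpath, as in the code above.

```scala
import org.json4s._
import org.json4s.JsonDSL.WithDouble._
import org.json4s.native.JsonMethods._

// Hypothetical helper (not from the original answer): builds the requested
// structure and adds the constant "isInTopList": false field to every node.
def toCategoriesJson(categories: List[(String, List[String])]): JObject = {
  val categoryObjects = categories.map { case (category, items) =>
    // One JSON object per item, with the constant flag attached.
    val nodes = items.map { item =>
      ("name" -> item) ~ ("isInTopList" -> false)
    }
    ("name" -> category) ~ ("nodes" -> nodes)
  }
  // The (String, List[JObject]) pair is converted to a JObject by the DSL implicits.
  "categories" -> categoryObjects
}

// Same sample data as above, kept as a local List so the sketch runs without Spark.
val sample = List(
  ("FRUIT", List("Apple", "Banana", "Mango")),
  ("VEGETABLE", List("Potato", "Tomato")))

val rendered = compact(render(toCategoriesJson(sample)))
println(rendered)
```

With the RDD from the first snippet, the call would be toCategoriesJson(sampleRdd.collect().toList). Using pretty(render(...)) instead of compact(render(...)) should give indented output closer to the layout shown in the question.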
