Fetching all column names in nested JSON


Question

I am trying to read a nested JSON file with Spark.

Is there any way to get all of the column names in this JSON file, including the nested ones?

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

class ReadData {
    public static void main(String[] args) throws Exception {

        SparkConf conf = new SparkConf().setAppName("Search").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
        DataFrame df1 = sqlContext.read().json("TestData.json");
        df1.printSchema();
        // columns() returns only the top-level column names
        String[] columns = df1.columns();
        int total_columns = columns.length;
        System.out.println("column names :");
        for (int i = 0; i < total_columns; i++) {
            System.out.println(columns[i]);
        }
    }
}

Contents of TestData.json:

{
    "id":"1",
    "name": {
      "first_name":"Joe",
      "last_name":"Thomas"
    }
}

Output of my code:

column names :

id
name

Expected output:

column names :
id
name.first_name
name.last_name

Recommended answer

Here is a possible solution to your problem. It parses the file with org.json and walks nested objects and arrays of objects recursively, building dotted names as it goes. I have only tried to handle some of the scenarios, but this should do the trick.

package com.controller;

import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.json.JSONArray;
import org.json.JSONObject;

public class JSONColumnNameExtract {

static List<String> colNames;

public static void main(String[] args) {
    colNames = new ArrayList<String>();
    String jsonString = "";
    try {
        jsonString = readFile("C:\\jsonInput.json",
                StandardCharsets.UTF_8);
    } catch (IOException e) {
        // Without the file contents there is nothing to parse, so stop here.
        e.printStackTrace();
        return;
    }
    JSONObject mainJObject = new JSONObject(jsonString);
    Iterator<?> keys = mainJObject.keys();
    while (keys.hasNext()) {
        String key = (String) keys.next();
        if (mainJObject.get(key) instanceof JSONArray) {
            JSONArray array = (JSONArray) mainJObject.get(key);
            for (int i = 0; i < array.length(); i++) {
                iterateJSON(array.get(i), key);
            }
            continue;
        }
        if (mainJObject.get(key) instanceof JSONObject) {
            iterateJSON(mainJObject.get(key), key);
        } else {
            if (!colNames.contains(key))
                colNames.add(key);
        }
    }
    for (String colName : colNames)
        System.out.println(colName);
}

private static void iterateJSON(Object object, String key2) {
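    if (!(object instanceof JSONObject)) {
        // Array elements may be plain values (e.g. ["a", "b"]); record the array's own path.
        if (!colNames.contains(key2))
            colNames.add(key2);
        return;
    }
    JSONObject jsonObject = (JSONObject) object;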
    Iterator<?> keys = jsonObject.keys();
    String key;
    while (keys.hasNext()) {
        key = (String) keys.next();
        if (jsonObject.get(key) instanceof JSONArray) {
            JSONArray array = (JSONArray) jsonObject.get(key);
            for (int i = 0; i < array.length(); i++) {
                // Keep the parent prefix so arrays nested inside objects get full paths.
                iterateJSON(array.get(i), key2 + "." + key);
            }
            continue;
        }
        if (jsonObject.get(key) instanceof JSONObject) {
            iterateJSON(jsonObject.get(key), key2 + "." + key);
        } else {
            if (!colNames.contains(key2 + "." + key))
                colNames.add(key2 + "." + key);
            continue;
        }
    }
}

static String readFile(String path, Charset encoding) throws IOException {
    byte[] encoded = Files.readAllBytes(Paths.get(path));
    return new String(encoded, encoding);
}

}

Sample input JSON I used:

{  
   "id":"1",
    "name":{  
      "first_name":"Joe",
      "last_name":"Thomas"
   },
   "address":[  
      {  
         "first_line":"Joe",
         "city":{  
            "city_name":"Bangalore",
            "city_pin":650659
         }
      },
      {  
         "first_line":"Joe",
         "city":{  
            "city_name":"Bangalore",
            "city_pin":650659,
            "city_pin2":65065933
         }
      }
   ]
}

Output:

address.first_line
address.city.city_name
address.city.city_pin
address.city.city_pin2
name.last_name
name.first_name
id
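
The names above come out in no particular order because org.json's JSONObject does not guarantee key order. As a minimal sketch of the same recursive walk without the org.json dependency, the version below assumes the JSON has already been parsed into nested java.util Map and List values. The class name PathCollector and the method collectPaths are hypothetical names chosen for illustration and are not part of the original answer. It uses a LinkedHashSet so that names are de-duplicated and kept in first-seen order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: collects dotted leaf paths from nested Map/List data.
final class PathCollector {

    // Returns every leaf path in dotted form, de-duplicated, in first-seen order.
    public static List<String> collectPaths(Map<String, Object> root) {
        Set<String> paths = new LinkedHashSet<>();
        walk(root, "", paths);
        return new ArrayList<>(paths);
    }

    private static void walk(Object node, String prefix, Set<String> paths) {
        if (node instanceof Map) {
            // Objects extend the current path with each of their keys.
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
                String key = String.valueOf(entry.getKey());
                String path = prefix.isEmpty() ? key : prefix + "." + key;
                walk(entry.getValue(), path, paths);
            }
        } else if (node instanceof List) {
            // Array elements share the array's own path, so their keys are merged.
            for (Object element : (List<?>) node) {
                walk(element, prefix, paths);
            }
        } else if (!prefix.isEmpty()) {
            // Anything else (string, number, boolean, null) is a leaf.
            paths.add(prefix);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> name = new LinkedHashMap<>();
        name.put("first_name", "Joe");
        name.put("last_name", "Thomas");

        Map<String, Object> city = new LinkedHashMap<>();
        city.put("city_name", "Bangalore");
        city.put("city_pin", 650659);

        Map<String, Object> address = new LinkedHashMap<>();
        address.put("first_line", "Joe");
        address.put("city", city);

        List<Object> addresses = new ArrayList<>();
        addresses.add(address);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("id", "1");
        root.put("name", name);
        root.put("address", addresses);

        for (String path : collectPaths(root)) {
            System.out.println(path);
        }
    }
}
```

Because a single method handles objects, arrays and leaf values, arrays nested at any depth keep their full prefix. Empty objects and empty arrays contribute no names in this sketch.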
