使用Pandas Python将嵌套的JSON解析为多个数据框 [英] parsing nested JSON into multiple dataframe using pandas python

查看:1177
本文介绍了使用Pandas Python将嵌套的JSON解析为多个数据框的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个嵌套的JSON,如下所示,并且想解析为python中的多个数据框..请帮助

I have a nested JSON as shown below and want to parse into multiple dataframe in python .. please help

{
"tableName": "cases",
"url": "EndpointVoid",
"tableDataList": [{
    "_id": "100017252700",
    "title": "Test",
    "type": "TECH",
    "created": "2016-09-06T19:00:17.071Z",
    "createdBy": "193164275",
    "lastModified": "2016-10-04T21:50:49.539Z",
    "lastModifiedBy": "1074113719",
    "notes": [{
        "id": "30",
        "title": "Multiple devices",
        "type": "INCCL",
        "origin": "D",
        "componentCode": "PD17A",
        "issueCode": "IP321",
        "affectedProduct": "134322",
        "summary": "testing the json",

        "caller": {
            "email": "katie.slabiak@spps.org",
            "phone": "651-744-4522"
        }
    }, {
        "id": "50",
        "title": "EDU: Multiple Devices - Lightning-to-USB Cable",
        "type": "INCCL",
        "origin": "D",
        "componentCode": "PD17A",
        "issueCode": "IP321",
        "affectedProduct": "134322",
        "summary": "parsing json 2",
        "caller": {
            "email": "testing1@test.org",
            "phone": "123-345-1111"
        }
    }],
    "syncCount": 2316,
    "repair": [{
            "id": "D208491610",
            "created": "2016-09-06T19:02:48.000Z",
            "createdBy": "193164275",
            "lastModified": "2016-09-21T12:49:47.000Z"
        }, {
            "id": "D208491610"
        }, {
            "id": "D208491628",
            "created": "2016-09-06T19:03:37.000Z",
            "createdBy": "193164275",
            "lastModified": "2016-09-21T12:49:47.000Z"
        }

    ],
    "enterpriseStatus": "8"
}],
"dateTime": 1475617849,
"primaryKeys": ["$._id"],
"primaryKeyVals": ["100017252700"],
"operation": "UPDATE"

}

我想对此进行解析并创建3个表/数据框/csv,如下所示..请帮助..

I want to parse this and create 3 tables/dataframe/csv as shown below.. please help..

此格式的输出表

推荐答案

我认为这不是最好的方法,但我想向您展示可能性.

I don't think this is best way, but I wanted to show you possibility.

import pandas as pd
from pandas.io.json import json_normalize
import json

with open('your_sample.json') as f:    
    dt = json.load(f)

表1

df1 = json_normalize(dt, 'tableDataList', 'dateTime')[['_id', 'title', 'type', 'created', 'createdBy', 'lastModified', 'lastModifiedBy', 'dateTime']]
print df1


            _id title  type                   created  createdBy  \
0  100017252700  Test  TECH  2016-09-06T19:00:17.071Z  193164275   

               lastModified lastModifiedBy    dateTime  
0  2016-10-04T21:50:49.539Z     1074113719  1475617849  

表2

df2 = json_normalize(dt['tableDataList'], 'notes', '_id')
df2['phone'] = df2['caller'].map(lambda x: x['phone'])
df2['email'] = df2['caller'].map(lambda x: x['email'])
df2 = df2[['_id', 'id', 'title', 'email', 'phone']]
print df2


            _id  id                                           title  \
0  100017252700  30                                Multiple devices   
1  100017252700  50  EDU: Multiple Devices - Lightning-to-USB Cable   

                    email         phone  
0  katie.slabiak@spps.org  651-744-4522  
1       testing1@test.org  123-345-1111  

表3

df3 = json_normalize(dt['tableDataList'], 'repair', '_id').dropna()
print df3


                    created  createdBy          id              lastModified  \
0  2016-09-06T19:02:48.000Z  193164275  D208491610  2016-09-21T12:49:47.000Z   
2  2016-09-06T19:03:37.000Z  193164275  D208491628  2016-09-21T12:49:47.000Z   

            _id  
0  100017252700  
2  100017252700  

这篇关于使用Pandas Python将嵌套的JSON解析为多个数据框的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆