如何使用python扁平化JSON数组? [英] How to flatten a JSON array with python?

查看：166 发布时间：2021/2/13 21:07:15 python json list recursion

本文介绍了如何使用python扁平化JSON数组?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在处理JSON结构，其输出如下:

I'm dealing with a JSON structure, which is output as follows:

{
    "time": "2015-10-20T20:15:00.847Z",
    "name": "meta.response.ean",
    "level": "info",
    "data1": {
        "HotelListResponse": {
            "customerSessionId": "0AB29024-F6D4-3915-0862-DB3FD1904C5A",
            "numberOfRoomsRequested": 1,
            "moreResultsAvailable": true,
            "cacheKey": "-705f6d43:15086db3fd1:-4c58",
            "cacheLocation": "10.178.144.36:7300",
            "HotelList": {
                "@size": 2,
                "@activePropertyCount": 2,
                "HotelSummary": [{
                        "hotelId": 132684,
                        "city": "Seattle",
                        "highRate": 159.0,
                        "lowRate": 159.0,
                        "rateCurrencyCode": "USD",
                        "RoomRateDetailsList": {
                            "RoomRateDetails": {
                                "roomTypeCode": 10351,
                                "rateCode": 10351,
                                "roomDescription": "Standard Room, 1 Queen Bed",
                                "RateInfos": {
                                    "RateInfo": {
                                        "@promo": false,
                                        "ChargeableRateInfo": {
                                            "@averageBaseRate": 159.0,
                                            "@averageRate": 159.0,
                                            "@currencyCode": "USD",
                                            "@nightlyRateTotal": 159.0,
                                            "@surchargeTotal": 26.81,
                                            "@total": 185.81
                                        }
                                    }
                                }
                            }
                        }
                    }, {
                        "hotelId": 263664,
                        "city": "Las Vegas",
                        "highRate": 135.0,
                        "lowRate": 94.5,
                        "rateCurrencyCode": "USD",
                        "RoomRateDetailsList": {
                            "RoomRateDetails": {
                                "roomTypeCode": 373685,
                                "rateCode": 1238953,
                                "roomDescription": "Standard Room, 1 King Bed",
                                "RateInfos": {
                                    "RateInfo": {
                                        "@promo": true,
                                        "ChargeableRateInfo": {
                                            "@averageBaseRate": 135.0,
                                            "@averageRate": 94.5,
                                            "@currencyCode": "USD",
                                            "@nightlyRateTotal": 94.5,
                                            "@surchargeTotal": 9.45,
                                            "@total": 103.95
                                        }
                                    }
                                }
                            }
                        }
                    }
                ]
            }
        }
    },
    "context": {
        "X-Request-Id": "dca47992-b6cc-4b87-956c-90523c0bf3bb",
        "host": "getaways-search-app2",
        "thread": "http-nio-80-exec-12"
    }
}

如您所见，这些是嵌套数组.关于递归地展平它们有很多讨论.我无法展平HotelSummary下的数组.有什么想法吗?

As you can see, these are nested arrays. There is much discussion about flattening these recursively. I am unable to flatten the arrays under HotelSummary. Any ideas?

我想将JSON的一部分拼合为以下形式:

{  
   "customerSessionId":"0AB29024-F6D4-3915-0862-DB3FD1904C5A",
   "numberOfRoomsRequested":1,
   "moreResultsAvailable":"true",
   "cacheKey":"-705f6d43:15086db3fd1:-4c58",
   "cacheLocation":"10.178.144.36:7300",
   "size":2,
   "activePropertyCount":2,
   "hotelId":132684,
   "city":"Seattle",
   "highRate":159.0,
   "lowRate":159.0,
   "rateCurrencyCode":"USD",
   "roomTypeCode":10351,
   "rateCode":10351,
   "roomDescription":"Standard Room, 1 Queen Bed",
   "promo":"false",
   "averageBaseRate":159.0,
   "averageRate":159.0,
   "currencyCode":"USD",
   "nightlyRateTotal":159.0,
   "surchargeTotal":26.81,
   "total":185.81
}


{  
   "customerSessionId":"0AB29024-F6D4-3915-0862-DB3FD1904C5A",
   "numberOfRoomsRequested":1,
   "moreResultsAvailable":"true",
   "cacheKey":"-705f6d43:15086db3fd1:-4c58",
   "cacheLocation":"10.178.144.36:7300",
   "size":2,
   "activePropertyCount":2,
   "hotelId":263664,
   "city":"Las Vegas",
   "highRate":135.0,
   "lowRate":94.5,
   "rateCurrencyCode":"USD",
   "roomTypeCode":373685,
   "rateCode":1238953,
   "roomDescription":"Standard Room, 1 King Bed",
   "promo":"true",
   "averageBaseRate":135.0,
   "averageRate":94.5,
   "currencyCode":"USD",
   "nightlyRateTotal":94.5,
   "surchargeTotal":9.45,
   "total":103.95
}

我尝试使用flattenDict类.我没有得到所需格式的输出.

I have tried using flattenDict class. I am not getting the output in the desired format.

def flattenDict(d, result=None):
    if result is None:
        result = {}
    for key in d:
        value = d[key]
        if isinstance(value, dict):
            value1 = {}
            for keyIn in value:
                value1[".".join([key,keyIn])]=value[keyIn]
            flattenDict(value1, result)
        elif isinstance(value, (list, tuple)):   
            for indexB, element in enumerate(value):
                if isinstance(element, dict):
                    value1 = {}
                    index = 0
                    for keyIn in element:
                        newkey = ".".join([key,keyIn])        
                        value1[".".join([key,keyIn])]=value[indexB][keyIn]
                        index += 1
                    for keyA in value1:
                        flattenDict(value1, result)   
        else:
            result[key]=value
    return result

使用 `pandas` & `json_normalize` :

record_path是主要key要展平的参数
meta是用于附加keys进行展平的参数
json_normalize创建的列名包括所有keys到所需的key，因此长列名(例如RoomRateDetailsList.RoomRateDetails.roomTypeCode)
长列名称需要重命名为较短的版本
dict理解用于创建rename dict.
以下代码利用了 pathlib 库
.open是pathlib
也适用于非Windows路径

Use `pandas` & `json_normalize`:

record_path is the parameter for the main key to flatten
meta is the parameter for additional keys to flatten
json_normalize creates column names that include all keys to the desired key, hence the long column names (e.g. RoomRateDetailsList.RoomRateDetails.roomTypeCode)
Long column names need to be renamed to shorter versions
A dict comprehension is used to create a rename dict.
The following code takes advantage of the pathlib library
.open is a method of pathlib
Works with non-Windows paths too

import pandas as pd
import json
from pathlib import Path


# path to file
p = Path(r'c:\some_path_to_file\test.json')

# read json file
with p.open('r', encoding='utf-8') as f:
    data = json.loads(f.read())

# create dataframe
df = pd.json_normalize(data,
                    record_path=['data1', 'HotelListResponse', 'HotelList', 'HotelSummary'],
                    meta=[['data1', 'HotelListResponse', 'customerSessionId'],
                          ['data1', 'HotelListResponse', 'numberOfRoomsRequested'],
                          ['data1', 'HotelListResponse', 'moreResultsAvailable'],
                          ['data1', 'HotelListResponse', 'cacheKey'],
                          ['data1', 'HotelListResponse', 'cacheLocation'],
                          ['data1', 'HotelListResponse', 'HotelList', '@size'],
                          ['data1', 'HotelListResponse', 'HotelList', '@activePropertyCount']])

# rename columns:
rename = {value: value.split('.')[-1].replace('@', '') for value in df.columns}
df.rename(columns=rename, inplace=True)

# dataframe view
 hotelId       city  highRate  lowRate rateCurrencyCode  roomTypeCode  rateCode             roomDescription  promo  averageBaseRate  averageRate currencyCode  nightlyRateTotal  surchargeTotal   total                     customerSessionId numberOfRoomsRequested moreResultsAvailable                     cacheKey       cacheLocation size activePropertyCount
  132684    Seattle     159.0    159.0              USD         10351     10351  Standard Room, 1 Queen Bed  False            159.0        159.0          USD             159.0           26.81  185.81  0AB29024-F6D4-3915-0862-DB3FD1904C5A                      1                 True  -705f6d43:15086db3fd1:-4c58  10.178.144.36:7300    2                   2
  263664  Las Vegas     135.0     94.5              USD        373685   1238953   Standard Room, 1 King Bed   True            135.0         94.5          USD              94.5            9.45  103.95  0AB29024-F6D4-3915-0862-DB3FD1904C5A                      1                 True  -705f6d43:15086db3fd1:-4c58  10.178.144.36:7300    2                   2

# save to JSON
df.to_json('out.json', orient='records')

最终JSON输出:

[{
        "hotelId": 132684,
        "city": "Seattle",
        "highRate": 159.0,
        "lowRate": 159.0,
        "rateCurrencyCode": "USD",
        "roomTypeCode": 10351,
        "rateCode": 10351,
        "roomDescription": "Standard Room, 1 Queen Bed",
        "promo": false,
        "averageBaseRate": 159.0,
        "averageRate": 159.0,
        "currencyCode": "USD",
        "nightlyRateTotal": 159.0,
        "surchargeTotal": 26.81,
        "total": 185.81,
        "customerSessionId": "0AB29024-F6D4-3915-0862-DB3FD1904C5A",
        "numberOfRoomsRequested": 1,
        "moreResultsAvailable": true,
        "cacheKey": "-705f6d43:15086db3fd1:-4c58",
        "cacheLocation": "10.178.144.36:7300",
        "size": 2,
        "activePropertyCount": 2
    }, {
        "hotelId": 263664,
        "city": "Las Vegas",
        "highRate": 135.0,
        "lowRate": 94.5,
        "rateCurrencyCode": "USD",
        "roomTypeCode": 373685,
        "rateCode": 1238953,
        "roomDescription": "Standard Room, 1 King Bed",
        "promo": true,
        "averageBaseRate": 135.0,
        "averageRate": 94.5,
        "currencyCode": "USD",
        "nightlyRateTotal": 94.5,
        "surchargeTotal": 9.45,
        "total": 103.95,
        "customerSessionId": "0AB29024-F6D4-3915-0862-DB3FD1904C5A",
        "numberOfRoomsRequested": 1,
        "moreResultsAvailable": true,
        "cacheKey": "-705f6d43:15086db3fd1:-4c58",
        "cacheLocation": "10.178.144.36:7300",
        "size": 2,
        "activePropertyCount": 2
    }
]

这篇关于如何使用python扁平化JSON数组?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

如何使用python扁平化JSON数组? [英] How to flatten a JSON array with python?

问题描述

推荐答案

使用 `pandas` & `json_normalize` :

Use `pandas` & `json_normalize`:

最终JSON输出:

相关文章

Python最新文章

热门教程

热门工具

登录关闭

如何使用python扁平化JSON数组? [英] How to flatten a JSON array with python?

问题描述

推荐答案

使用 pandas & json_normalize :

Use pandas & json_normalize:

最终JSON输出:

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

使用 `pandas` & `json_normalize` :

Use `pandas` & `json_normalize`:

登录关闭